mbox series

[v8,0/8] add DSC 1.2 dpu supports

Message ID 1683914423-17612-1-git-send-email-quic_khsieh@quicinc.com
Headers show
Series add DSC 1.2 dpu supports | expand

Message

Kuogee Hsieh May 12, 2023, 6 p.m. UTC
This series adds the DPU side changes to support DSC 1.2 encoder. This
was validated with both DSI DSC 1.2 panel and DP DSC 1.2 monitor.
The DSI and DP parts will be pushed later on top of this change.
This seriel is rebase on [1], [2] and catalog fixes from rev-4 of [3].

[1]: https://patchwork.freedesktop.org/series/116851/
[2]: https://patchwork.freedesktop.org/series/116615/
[3]: https://patchwork.freedesktop.org/series/112332/

Abhinav Kumar (2):
  drm/msm/dpu: add dsc blocks for remaining chipsets in catalog
  drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets

Kuogee Hsieh (6):
  drm/msm/dpu: add DPU_PINGPONG_DSC feature bit for DPU < 7.0.0
  drm/msm/dpu: test DPU_PINGPONG_DSC bit before assign DSC ops to
    PINGPONG
  drm/msm/dpu: Introduce PINGPONG_NONE to disconnect DSC from PINGPONG
  drm/msm/dpu: add support for DSC encoder v1.2 engine
  drm/msm/dpu: separate DSC flush update out of interface
  drm/msm/dpu: tear down DSC data path when DSC disabled

 drivers/gpu/drm/msm/Makefile                       |   1 +
 .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h    |   7 +
 .../drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h    |  11 +
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h |  14 +
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h |   7 +
 .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   |  16 +
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h |  14 +
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h |  14 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        |  59 +++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c     |  31 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |  36 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c         |  29 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h         |  10 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c         |  14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h         |  15 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c     | 382 +++++++++++++++++++++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h        |   3 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c    |   6 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             |   7 +-
 19 files changed, 649 insertions(+), 27 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c

Comments

Abhinav Kumar May 12, 2023, 6:47 p.m. UTC | #1
On 5/12/2023 11:21 AM, Dmitry Baryshkov wrote:
> On 12/05/2023 21:00, Kuogee Hsieh wrote:
>> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().
>> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by
>> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
>> DSC engine and DSC flush bits at same time to make it consistent with
>> the location of flush programming of other dpu sub blocks.
>>
>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> ---
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c  | 22 ++++++++++++++++------
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h  | 10 ++++++++++
>>   3 files changed, 38 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> index ffa6f04..5cae70e 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct 
>> drm_dsc_config *dsc,
>>       return DIV_ROUND_UP(total_pixels, dsc->slice_width);
>>   }
>> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,
>> +                     struct dpu_hw_dsc *hw_dsc,
>>                        struct dpu_hw_pingpong *hw_pp,
>>                        struct drm_dsc_config *dsc,
>>                        u32 common_mode,
>>                        u32 initial_lines)
>>   {
>> +    struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
>> +    struct dpu_hw_ctl *ctl;
>> +
>> +    ctl = cur_master->hw_ctl;
> 
> Just for my understanding: if we have a bonded DSI @ sdm845, should both 
> flashes go to the master CTL or each flush should go to the 
> corresponding CTL?
> 

Is this question for DSC or just general question about flush?

I dont see an explicit DSC flush needed in sdm845 at the ctl level.

If the question is about general flush involving two control paths, we 
need to combine the flushes and they goto the master only. Please refer 
to below part in sde_encoder.c

4243 	/* for split flush, combine pending flush masks and send to master */
4244 	if (pending_flush.pending_flush_mask && sde_enc->cur_master) {
4245 		ctl = sde_enc->cur_master->hw_ctl;
4246 		if (config_changed && ctl->ops.reg_dma_flush)
4247 			ctl->ops.reg_dma_flush(ctl, is_regdma_blocking);
4248 		_sde_encoder_trigger_flush(&sde_enc->base, sde_enc->cur_master,
4249 						&pending_flush,
4250 						config_changed);
4251 	}


> I'm going to send patches that utilize single CTL for sm8150+ after the 
> DSC lands, so I'd like to understand this part.
> 
>> +
>>       if (hw_dsc->ops.dsc_config)
>>           hw_dsc->ops.dsc_config(hw_dsc, dsc, common_mode, 
>> initial_lines);
>> @@ -1854,6 +1860,9 @@ static void dpu_encoder_dsc_pipe_cfg(struct 
>> dpu_hw_dsc *hw_dsc,
>>       if (hw_pp->ops.enable_dsc)
>>           hw_pp->ops.enable_dsc(hw_pp);
>> +
>> +    if (ctl->ops.update_pending_flush_dsc)
>> +        ctl->ops.update_pending_flush_dsc(ctl, hw_dsc->idx);
>>   }
>>   static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
>> @@ -1898,7 +1907,8 @@ static void dpu_encoder_prep_dsc(struct 
>> dpu_encoder_virt *dpu_enc,
>>       initial_lines = dpu_encoder_dsc_initial_line_calc(dsc, enc_ip_w);
>>       for (i = 0; i < MAX_CHANNELS_PER_ENC; i++)
>> -        dpu_encoder_dsc_pipe_cfg(hw_dsc[i], hw_pp[i], dsc, 
>> dsc_common_mode, initial_lines);
>> +        dpu_encoder_dsc_pipe_cfg(dpu_enc, hw_dsc[i], hw_pp[i], dsc,
>> +                     dsc_common_mode, initial_lines);
>>   }
>>   void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc)
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c 
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
>> index 4f7cfa9..f3a50cc 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
>> @@ -139,6 +139,11 @@ static inline void 
>> dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
>>                   CTL_DSPP_n_FLUSH(dspp - DSPP_0),
>>                   ctx->pending_dspp_flush_mask[dspp - DSPP_0]);
>>           }
>> +
>> +    if (ctx->pending_flush_mask & BIT(DSC_IDX))
>> +        DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH,
>> +                  ctx->pending_dsc_flush_mask);
>> +
>>       DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, ctx->pending_flush_mask);
>>   }
>> @@ -285,6 +290,13 @@ static void 
>> dpu_hw_ctl_update_pending_flush_merge_3d_v1(struct dpu_hw_ctl *ctx,
>>       ctx->pending_flush_mask |= BIT(MERGE_3D_IDX);
>>   }
>> +static void dpu_hw_ctl_update_pending_flush_dsc_v1(struct dpu_hw_ctl 
>> *ctx,
>> +                           enum dpu_dsc dsc_num)
>> +{
>> +    ctx->pending_dsc_flush_mask |= BIT(dsc_num - DSC_0);
>> +    ctx->pending_flush_mask |= BIT(DSC_IDX);
>> +}
>> +
>>   static void dpu_hw_ctl_update_pending_flush_dspp(struct dpu_hw_ctl 
>> *ctx,
>>       enum dpu_dspp dspp, u32 dspp_sub_blk)
>>   {
>> @@ -502,9 +514,6 @@ static void dpu_hw_ctl_intf_cfg_v1(struct 
>> dpu_hw_ctl *ctx,
>>       if ((test_bit(DPU_CTL_VM_CFG, &ctx->caps->features)))
>>           mode_sel = CTL_DEFAULT_GROUP_ID  << 28;
>> -    if (cfg->dsc)
>> -        DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH, cfg->dsc);
>> -
>>       if (cfg->intf_mode_sel == DPU_CTL_MODE_SEL_CMD)
>>           mode_sel |= BIT(17);
>> @@ -524,10 +533,8 @@ static void dpu_hw_ctl_intf_cfg_v1(struct 
>> dpu_hw_ctl *ctx,
>>       if (cfg->merge_3d)
>>           DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
>>                     BIT(cfg->merge_3d - MERGE_3D_0));
>> -    if (cfg->dsc) {
>> -        DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, DSC_IDX);
>> +    if (cfg->dsc)
>>           DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
>> -    }
>>   }
>>   static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
>> @@ -630,6 +637,9 @@ static void _setup_ctl_ops(struct dpu_hw_ctl_ops 
>> *ops,
>>           ops->update_pending_flush_merge_3d =
>>               dpu_hw_ctl_update_pending_flush_merge_3d_v1;
>>           ops->update_pending_flush_wb = 
>> dpu_hw_ctl_update_pending_flush_wb_v1;
>> +
>> +        ops->update_pending_flush_dsc =
>> +            dpu_hw_ctl_update_pending_flush_dsc_v1;
>>       } else {
>>           ops->trigger_flush = dpu_hw_ctl_trigger_flush;
>>           ops->setup_intf_cfg = dpu_hw_ctl_intf_cfg;
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h 
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
>> index 6292002..d4869a0 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
>> @@ -158,6 +158,15 @@ struct dpu_hw_ctl_ops {
>>           enum dpu_dspp blk, u32 dspp_sub_blk);
>>       /**
>> +     * OR in the given flushbits to the cached pending_(dsc_)flush_mask
>> +     * No effect on hardware
>> +     * @ctx       : ctl path ctx pointer
>> +     * @blk       : interface block index
>> +     */
>> +    void (*update_pending_flush_dsc)(struct dpu_hw_ctl *ctx,
>> +                     enum dpu_dsc blk);
>> +
>> +    /**
>>        * Write the value of the pending_flush_mask to hardware
>>        * @ctx       : ctl path ctx pointer
>>        */
>> @@ -245,6 +254,7 @@ struct dpu_hw_ctl {
>>       u32 pending_wb_flush_mask;
>>       u32 pending_merge_3d_flush_mask;
>>       u32 pending_dspp_flush_mask[DSPP_MAX - DSPP_0];
>> +    u32 pending_dsc_flush_mask;
>>       /* ops */
>>       struct dpu_hw_ctl_ops ops;
>
Abhinav Kumar May 12, 2023, 7:05 p.m. UTC | #2
On 5/12/2023 11:50 AM, Dmitry Baryshkov wrote:
> On 12/05/2023 21:47, Abhinav Kumar wrote:
>>
>>
>> On 5/12/2023 11:21 AM, Dmitry Baryshkov wrote:
>>> On 12/05/2023 21:00, Kuogee Hsieh wrote:
>>>> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().
>>>> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by
>>>> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
>>>> DSC engine and DSC flush bits at same time to make it consistent with
>>>> the location of flush programming of other dpu sub blocks.
>>>>
>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>>> ---
>>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
>>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c  | 22 
>>>> ++++++++++++++++------
>>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h  | 10 ++++++++++
>>>>   3 files changed, 38 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
>>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> index ffa6f04..5cae70e 100644
>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct 
>>>> drm_dsc_config *dsc,
>>>>       return DIV_ROUND_UP(total_pixels, dsc->slice_width);
>>>>   }
>>>> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>>>> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,
>>>> +                     struct dpu_hw_dsc *hw_dsc,
>>>>                        struct dpu_hw_pingpong *hw_pp,
>>>>                        struct drm_dsc_config *dsc,
>>>>                        u32 common_mode,
>>>>                        u32 initial_lines)
>>>>   {
>>>> +    struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
>>>> +    struct dpu_hw_ctl *ctl;
>>>> +
>>>> +    ctl = cur_master->hw_ctl;
>>>
>>> Just for my understanding: if we have a bonded DSI @ sdm845, should 
>>> both flashes go to the master CTL or each flush should go to the 
>>> corresponding CTL?
>>>
>>
>> Is this question for DSC or just general question about flush?
>>
>> I dont see an explicit DSC flush needed in sdm845 at the ctl level.
>>
>> If the question is about general flush involving two control paths, we 
>> need to combine the flushes and they goto the master only. Please 
>> refer to below part in sde_encoder.c
> And this is because we have a single CTL to flush on sm8150+, isn't it?
> 

For sm8150+, yes there will be only a single CTL to flush even in bonded 
DSI mode so only one will be flushed.

So, in general, you can refer to the function 
sde_encoder_phys_needs_single_flush() to decide if it needs 2 flushes or 
one. That accounts for the DPU rev as well.

>>
>> 4243     /* for split flush, combine pending flush masks and send to 
>> master */
>> 4244     if (pending_flush.pending_flush_mask && sde_enc->cur_master) {
>> 4245         ctl = sde_enc->cur_master->hw_ctl;
>> 4246         if (config_changed && ctl->ops.reg_dma_flush)
>> 4247             ctl->ops.reg_dma_flush(ctl, is_regdma_blocking);
>> 4248         _sde_encoder_trigger_flush(&sde_enc->base, 
>> sde_enc->cur_master,
>> 4249                         &pending_flush,
>> 4250                         config_changed);
>> 4251     }
> 
>
Marijn Suijten May 13, 2023, 7:26 p.m. UTC | #3
On 2023-05-12 11:00:17, Kuogee Hsieh wrote:
> 
> DPU < 7.0.0 requires the PINGPONG block to be involved during
> DSC setting up. Since DPU >= 7.0.0, enabling and starting the DSC
> encoder engine was moved to INTF with the help of the flush mechanism.
> Add a DPU_PINGPONG_DSC feature bit to restrict the availability of
> dpu_hw_pp_setup_dsc() and dpu_hw_pp_dsc_{enable,disable}() on the
> PINGPONG block to DPU < 7.0.0 hardware, as the registers are not
> available [in the PINGPONG block] on DPU 7.0.0 and higher anymore.

Fwiw I added the brackets in the suggestion as an "up to you to include
this or not".  Drop the brackets if you think this should be part of the
sentence.

> Add DPU_PINGPONG_DSC to PINGPONG_SDM845_MASK, PINGPONG_SDM845_TE2_MASK
> and PINGPONG_SM8150_MASK which is used for all DPU < 7.0 chipsets.
> 
> changes in v6:
> -- split patches and rearrange to keep catalog related files at this patch
> 
> changes in v7:
> -- rewording commit text as suggested at review comments
> 
> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 6 +++---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 +++-
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index 82b58c6..78e4bf6 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -76,13 +76,13 @@
>  	(BIT(DPU_DIM_LAYER) | BIT(DPU_MIXER_COMBINED_ALPHA))
>  
>  #define PINGPONG_SDM845_MASK \
> -	(BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE))
> +	(BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE) | BIT(DPU_PINGPONG_DSC))
>  
>  #define PINGPONG_SDM845_TE2_MASK \
> -	(PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2))
> +	(PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2) | BIT(DPU_PINGPONG_DSC))

Don't add it here, this is already in PINGPONG_SDM845_MASK.

>  
>  #define PINGPONG_SM8150_MASK \
> -	(BIT(DPU_PINGPONG_DITHER))
> +	(BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_DSC))
>  
>  #define CTL_SC7280_MASK \
>  	(BIT(DPU_CTL_ACTIVE_CFG) | \
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> index 6ee48f0..dc0a4da 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> @@ -144,7 +144,8 @@ enum {
>   * @DPU_PINGPONG_TE2        Additional tear check block for split pipes
>   * @DPU_PINGPONG_SPLIT      PP block supports split fifo
>   * @DPU_PINGPONG_SLAVE      PP block is a suitable slave for split fifo
> - * @DPU_PINGPONG_DITHER,    Dither blocks
> + * @DPU_PINGPONG_DITHER     Dither blocks
> + * @DPU_PINGPONG_DSC        PP ops functions required for DSC

Following the other documentation wording:

    PP block supports DSC

Or:

    PP block has DSC enable/disable registers

- Marijn

>   * @DPU_PINGPONG_MAX
>   */
>  enum {
> @@ -153,6 +154,7 @@ enum {
>  	DPU_PINGPONG_SPLIT,
>  	DPU_PINGPONG_SLAVE,
>  	DPU_PINGPONG_DITHER,
> +	DPU_PINGPONG_DSC,
>  	DPU_PINGPONG_MAX
>  };
>  
> -- 
> 2.7.4
>
Marijn Suijten May 14, 2023, 9:31 p.m. UTC | #4
Asked this before: change the title to "DPU support" (capital "DPU",
singular "support") if this series keeps being resent.

On 2023-05-12 11:00:15, Kuogee Hsieh wrote:
> 
> This series adds the DPU side changes to support DSC 1.2 encoder. This
> was validated with both DSI DSC 1.2 panel and DP DSC 1.2 monitor.
> The DSI and DP parts will be pushed later on top of this change.
> This seriel is rebase on [1], [2] and catalog fixes from rev-4 of [3].

series*

rebased*

Also I think it's not just the catalog fixes but everything now, because
we were both touching HW block implementations?

- Marijn

> [1]: https://patchwork.freedesktop.org/series/116851/
> [2]: https://patchwork.freedesktop.org/series/116615/
> [3]: https://patchwork.freedesktop.org/series/112332/
> 
> Abhinav Kumar (2):
>   drm/msm/dpu: add dsc blocks for remaining chipsets in catalog
>   drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets
> 
> Kuogee Hsieh (6):
>   drm/msm/dpu: add DPU_PINGPONG_DSC feature bit for DPU < 7.0.0
>   drm/msm/dpu: test DPU_PINGPONG_DSC bit before assign DSC ops to
>     PINGPONG
>   drm/msm/dpu: Introduce PINGPONG_NONE to disconnect DSC from PINGPONG
>   drm/msm/dpu: add support for DSC encoder v1.2 engine
>   drm/msm/dpu: separate DSC flush update out of interface
>   drm/msm/dpu: tear down DSC data path when DSC disabled
> 
>  drivers/gpu/drm/msm/Makefile                       |   1 +
>  .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h    |   7 +
>  .../drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h    |  11 +
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h |  14 +
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h |   7 +
>  .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   |  16 +
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h |  14 +
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h |  14 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        |  59 +++-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c     |  31 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |  36 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c         |  29 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h         |  10 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c         |  14 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h         |  15 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c     | 382 +++++++++++++++++++++
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h        |   3 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c    |   6 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             |   7 +-
>  19 files changed, 649 insertions(+), 27 deletions(-)
>  create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
> 
> -- 
> 2.7.4%%
>
Marijn Suijten May 14, 2023, 9:43 p.m. UTC | #5
Fix the title: "Require DPU_PINGPONG_DSC bit for DSC ops on PINGPONG block"
Or: "Only enable PINGPONG DSC ops with DPU_PINGPONG_DSC bit"
Or (my preference): "Guard PINGPONG DSC ops behind DPU_PINGPONG_DSC bit"

On 2023-05-12 11:00:18, Kuogee Hsieh wrote:
> 
> DPU < 7.0.0 has DPU_PINGPONG_DSC feature bit set to indicate it requires
> both dpu_hw_pp_setup_dsc() and dpu_hw_pp_dsc_{enable,disable}() to be
> executed to complete DSC configuration if DSC hardware block is present.
> Hence test DPU_PINGPONG_DSC feature bit and assign DSC related functions
> to the ops of PINGPONG block accordingly if DPU_PINGPONG_DSC bit is set.
> 
> changes in v6:
> -- split patches, this patch has function handles DPU_PINGPONG_DSC bit
> 
> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
> index 79e4576..e7f47a4 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
> @@ -295,6 +295,12 @@ static void _setup_pingpong_ops(struct dpu_hw_pingpong *c,
>  	c->ops.enable_dsc = dpu_hw_pp_dsc_enable;
>  	c->ops.disable_dsc = dpu_hw_pp_dsc_disable;

Nak: the functions are unconditionally assigned right above, while this
bit of code should have been wrapped/replaced by the addition below.
Bad rebase?

- Marijn

> +	if (test_bit(DPU_PINGPONG_DSC, &features)) {
> +		c->ops.setup_dsc = dpu_hw_pp_setup_dsc;
> +		c->ops.enable_dsc = dpu_hw_pp_dsc_enable;
> +		c->ops.disable_dsc = dpu_hw_pp_dsc_disable;
> +	}
> +
>  	if (test_bit(DPU_PINGPONG_DITHER, &features))
>  		c->ops.setup_dither = dpu_hw_pp_setup_dither;
>  };
> -- 
> 2.7.4
>
Marijn Suijten May 15, 2023, 9:21 p.m. UTC | #6
On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
> 
> From: Abhinav Kumar <quic_abhinavk@quicinc.com>
> 
> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
> feature flag information.  Each display compression engine (DCE) contains
> dual hard slice DSC encoders so both share same base address but with
> its own different sub block address.

Can we have an explanation of hard vs soft slices in some commit message
and/or code documentation?

> 
> changes in v4:
> -- delete DPU_DSC_HW_REV_1_1
> -- re arrange sc8280xp_dsc[]
> 
> changes in v4:
> -- fix checkpatch warning
> 
> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 14 ++++++++++++
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h |  7 ++++++
>  .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   | 16 ++++++++++++++
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 14 ++++++++++++
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 14 ++++++++++++
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c     | 25 +++++++++++++++++++++-
>  6 files changed, 89 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
> index 500cfd0..c4c93c8 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
> @@ -153,6 +153,18 @@ static const struct dpu_merge_3d_cfg sm8350_merge_3d[] = {
>  	MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>  };
>  
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sm8350_dsc[] = {
> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),

Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
for the ctl sblk).  This simply fills it up to the start of the enc sblk
so that we can see all registers in the dump?  After all only
DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
adequate.

> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),

Should we add an extra suffix to the name to indicate which hard-slice
DSC encoder it is?  i.e. "dce_0_0" and "dce_0_1" etc?

> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),

See comment below about loose BIT() in features.

> +};
> +
>  static const struct dpu_intf_cfg sm8350_intf[] = {
>  	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>  			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
>  	.dspp = sm8350_dspp,
>  	.pingpong_count = ARRAY_SIZE(sm8350_pp),
>  	.pingpong = sm8350_pp,
> +	.dsc = sm8350_dsc,
> +	.dsc_count = ARRAY_SIZE(sm8350_dsc),

Count goes first **everywhere else**, let's not break consistency here.

>  	.merge_3d_count = ARRAY_SIZE(sm8350_merge_3d),
>  	.merge_3d = sm8350_merge_3d,
>  	.intf_count = ARRAY_SIZE(sm8350_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> index 5646713..42c66fe 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> @@ -93,6 +93,11 @@ static const struct dpu_pingpong_cfg sc7280_pp[] = {
>  	PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),
>  };
>  
> +/* NOTE: sc7280 only has one dsc hard slice encoder */

DSC

> +static const struct dpu_dsc_cfg sc7280_dsc[] = {
> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> +};
> +
>  static const struct dpu_intf_cfg sc7280_intf[] = {
>  	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>  			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -149,6 +154,8 @@ const struct dpu_mdss_cfg dpu_sc7280_cfg = {
>  	.mixer = sc7280_lm,
>  	.pingpong_count = ARRAY_SIZE(sc7280_pp),
>  	.pingpong = sc7280_pp,
> +	.dsc_count = ARRAY_SIZE(sc7280_dsc),
> +	.dsc = sc7280_dsc,
>  	.intf_count = ARRAY_SIZE(sc7280_intf),
>  	.intf = sc7280_intf,
>  	.vbif_count = ARRAY_SIZE(sdm845_vbif),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
> index 808aacd..1901fff 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
> @@ -141,6 +141,20 @@ static const struct dpu_merge_3d_cfg sc8280xp_merge_3d[] = {
>  	MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>  };
>  
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sc8280xp_dsc[] = {
> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> +	DSC_BLK_1_2("dce_2", DSC_4, 0x82000, 0x100, 0, dsc_sblk_0),
> +	DSC_BLK_1_2("dce_2", DSC_5, 0x82000, 0x100, 0, dsc_sblk_1),
> +};
> +
>  /* TODO: INTF 3, 8 and 7 are used for MST, marked as INTF_NONE for now */
>  static const struct dpu_intf_cfg sc8280xp_intf[] = {
>  	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> @@ -216,6 +230,8 @@ const struct dpu_mdss_cfg dpu_sc8280xp_cfg = {
>  	.dspp = sc8280xp_dspp,
>  	.pingpong_count = ARRAY_SIZE(sc8280xp_pp),
>  	.pingpong = sc8280xp_pp,
> +	.dsc = sc8280xp_dsc,
> +	.dsc_count = ARRAY_SIZE(sc8280xp_dsc),

Swap the two lines.

>  	.merge_3d_count = ARRAY_SIZE(sc8280xp_merge_3d),
>  	.merge_3d = sc8280xp_merge_3d,
>  	.intf_count = ARRAY_SIZE(sc8280xp_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
> index 1a89ff9..741d03f 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
> @@ -161,6 +161,18 @@ static const struct dpu_merge_3d_cfg sm8450_merge_3d[] = {
>  	MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x65f00),
>  };
>  
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sm8450_dsc[] = {
> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> +};
> +
>  static const struct dpu_intf_cfg sm8450_intf[] = {
>  	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>  			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -223,6 +235,8 @@ const struct dpu_mdss_cfg dpu_sm8450_cfg = {
>  	.dspp = sm8450_dspp,
>  	.pingpong_count = ARRAY_SIZE(sm8450_pp),
>  	.pingpong = sm8450_pp,
> +	.dsc = sm8450_dsc,
> +	.dsc_count = ARRAY_SIZE(sm8450_dsc),

Another swap.

>  	.merge_3d_count = ARRAY_SIZE(sm8450_merge_3d),
>  	.merge_3d = sm8450_merge_3d,
>  	.intf_count = ARRAY_SIZE(sm8450_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
> index 497b34c..3ee6dc8 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
> @@ -165,6 +165,18 @@ static const struct dpu_merge_3d_cfg sm8550_merge_3d[] = {
>  	MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x66700),
>  };
>  
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sm8550_dsc[] = {
> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> +};
> +
>  static const struct dpu_intf_cfg sm8550_intf[] = {
>  	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>  			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -227,6 +239,8 @@ const struct dpu_mdss_cfg dpu_sm8550_cfg = {
>  	.dspp = sm8550_dspp,
>  	.pingpong_count = ARRAY_SIZE(sm8550_pp),
>  	.pingpong = sm8550_pp,
> +	.dsc = sm8550_dsc,
> +	.dsc_count = ARRAY_SIZE(sm8550_dsc),

Swap.

>  	.merge_3d_count = ARRAY_SIZE(sm8550_merge_3d),
>  	.merge_3d = sm8550_merge_3d,
>  	.intf_count = ARRAY_SIZE(sm8550_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index 78e4bf6..c1d7338 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -1,6 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
>   */
>  
>  #define pr_fmt(fmt)	"[drm:%s:%d] " fmt, __func__, __LINE__
> @@ -522,6 +522,16 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
>  /*************************************************************
>   * DSC sub blocks config
>   *************************************************************/
> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
> +	.enc = {.base = 0x100, .len = 0x100},
> +	.ctl = {.base = 0xF00, .len = 0x10},
> +};
> +
> +static const struct dpu_dsc_sub_blks dsc_sblk_1 = {
> +	.enc = {.base = 0x200, .len = 0x100},
> +	.ctl = {.base = 0xF80, .len = 0x10},
> +};
> +
>  #define DSC_BLK(_name, _id, _base, _features) \
>  	{\
>  	.name = _name, .id = _id, \
> @@ -529,6 +539,19 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
>  	.features = _features, \
>  	}
>  
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \

There are no address values here so this comment doesn't seem very
useful, and it is already duplicated on every DSC block array, where the
duplication is more visible.  Drop the comment here?

> +	{\
> +	.name = _name, .id = _id, \
> +	.base = _base, .len = _len, \

The len is always 0x100 (downstream says 0x10), should we hardcode it
here and drop _len?  We can always add it back if a future revision
starts changing it, but that's not the case currently.

> +	.features = BIT(DPU_DSC_HW_REV_1_2) | _features, \

We don't willy-nilly append bits like that: should there be global
feature flags?

Or is this the start of a new era where we expand those defines in-line
and drop them altogether?  I'd much prefer that but we should first
align on this direction (and then also make the switch globally in a
followup).

- Marijn

> +	.sblk = &_sblk, \
> +	}
> +
>  /*************************************************************
>   * INTF sub blocks config
>   *************************************************************/
> -- 
> 2.7.4
>
Marijn Suijten May 15, 2023, 9:24 p.m. UTC | #7
By the way, can we replace "relevant chipsets" in the title with
"DPU >= 7.0" like the other titles?

- Marijn

On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
<snip>
Marijn Suijten May 15, 2023, 9:52 p.m. UTC | #8
On 2023-05-12 11:00:21, Kuogee Hsieh wrote:
> 
> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().

Can you rewrite "is piggyback"?  Something like "Currently DSC flushing
happens during interface configuration".  And it's intf configuration
**on the CTL**, which makes this extra confusing.

> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by

Drop "This patch".  Then, separates -> Separate

> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
> DSC engine and DSC flush bits at same time to make it consistent with

Make that per-DSC with a hyphen.

> the location of flush programming of other dpu sub blocks.

DPU sub-blocks.

> 
> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c  | 22 ++++++++++++++++------
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h  | 10 ++++++++++
>  3 files changed, 38 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> index ffa6f04..5cae70e 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct drm_dsc_config *dsc,
>  	return DIV_ROUND_UP(total_pixels, dsc->slice_width);
>  }
>  
> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,

Why not pass hw_ctl directly?  The other blocks are directly passed as
well, and the caller already has cur_master.  Otherwise we might as well
inline the for loops.  Same question for the new _clr call added in
patch 8/8.

> +				     struct dpu_hw_dsc *hw_dsc,
>  				     struct dpu_hw_pingpong *hw_pp,
>  				     struct drm_dsc_config *dsc,
>  				     u32 common_mode,
>  				     u32 initial_lines)
>  {
> +	struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
> +	struct dpu_hw_ctl *ctl;
> +
> +	ctl = cur_master->hw_ctl;

Assign this directly at declaration, just like cur_master (but
irrelevant if you pass this directly instead).

> +
>  	if (hw_dsc->ops.dsc_config)
>  		hw_dsc->ops.dsc_config(hw_dsc, dsc, common_mode, initial_lines);
>  
> @@ -1854,6 +1860,9 @@ static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>  
>  	if (hw_pp->ops.enable_dsc)
>  		hw_pp->ops.enable_dsc(hw_pp);
> +
> +	if (ctl->ops.update_pending_flush_dsc)
> +		ctl->ops.update_pending_flush_dsc(ctl, hw_dsc->idx);
>  }
>  
>  static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
> @@ -1898,7 +1907,8 @@ static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
>  	initial_lines = dpu_encoder_dsc_initial_line_calc(dsc, enc_ip_w);
>  
>  	for (i = 0; i < MAX_CHANNELS_PER_ENC; i++)
> -		dpu_encoder_dsc_pipe_cfg(hw_dsc[i], hw_pp[i], dsc, dsc_common_mode, initial_lines);
> +		dpu_encoder_dsc_pipe_cfg(dpu_enc, hw_dsc[i], hw_pp[i], dsc,
> +					 dsc_common_mode, initial_lines);
>  }
>  
>  void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc)
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> index 4f7cfa9..f3a50cc 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> @@ -139,6 +139,11 @@ static inline void dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
>  				CTL_DSPP_n_FLUSH(dspp - DSPP_0),
>  				ctx->pending_dspp_flush_mask[dspp - DSPP_0]);
>  		}
> +
> +	if (ctx->pending_flush_mask & BIT(DSC_IDX))
> +		DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH,
> +			      ctx->pending_dsc_flush_mask);

When are we setting this to zero again?

Same question for the other masks, only the global pending_flush_mask
and pending_dspp_flush_mask are reset in dpu_hw_ctl_clear_pending_flush.

> +
>  	DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, ctx->pending_flush_mask);
>  }
>  
> @@ -285,6 +290,13 @@ static void dpu_hw_ctl_update_pending_flush_merge_3d_v1(struct dpu_hw_ctl *ctx,
>  	ctx->pending_flush_mask |= BIT(MERGE_3D_IDX);
>  }
>  
> +static void dpu_hw_ctl_update_pending_flush_dsc_v1(struct dpu_hw_ctl *ctx,
> +						   enum dpu_dsc dsc_num)
> +{
> +	ctx->pending_dsc_flush_mask |= BIT(dsc_num - DSC_0);
> +	ctx->pending_flush_mask |= BIT(DSC_IDX);
> +}
> +
>  static void dpu_hw_ctl_update_pending_flush_dspp(struct dpu_hw_ctl *ctx,
>  	enum dpu_dspp dspp, u32 dspp_sub_blk)
>  {
> @@ -502,9 +514,6 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
>  	if ((test_bit(DPU_CTL_VM_CFG, &ctx->caps->features)))
>  		mode_sel = CTL_DEFAULT_GROUP_ID  << 28;
>  
> -	if (cfg->dsc)
> -		DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH, cfg->dsc);
> -
>  	if (cfg->intf_mode_sel == DPU_CTL_MODE_SEL_CMD)
>  		mode_sel |= BIT(17);
>  
> @@ -524,10 +533,8 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
>  	if (cfg->merge_3d)
>  		DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
>  			      BIT(cfg->merge_3d - MERGE_3D_0));

Can we have a newline here?

> -	if (cfg->dsc) {
> -		DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, DSC_IDX);

Found the reason why this patch (as one of the few) is needed to get
display working on my SM8150/SM8250 devices: the semantic change is that
BIT() was missing around DSC_IDX here.
(It wasn't hampering SDM845 because it doesn't have a configurable
 crossbar, i.e. DPU_CTL_ACTIVE_CFG)

Manually reverting this patch and adding BIT() indeed also fixes the
issue.

This semantic change should be documented in the description and with a
Fixes: (and Reported-by:?), or as a separate preliminary patch for
clarity.

> +	if (cfg->dsc)
>  		DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
> -	}
>  }
>  
>  static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
> @@ -630,6 +637,9 @@ static void _setup_ctl_ops(struct dpu_hw_ctl_ops *ops,
>  		ops->update_pending_flush_merge_3d =
>  			dpu_hw_ctl_update_pending_flush_merge_3d_v1;
>  		ops->update_pending_flush_wb = dpu_hw_ctl_update_pending_flush_wb_v1;
> +

And while adding a newline above, drop the one here.

> +		ops->update_pending_flush_dsc =
> +			dpu_hw_ctl_update_pending_flush_dsc_v1;
>  	} else {
>  		ops->trigger_flush = dpu_hw_ctl_trigger_flush;
>  		ops->setup_intf_cfg = dpu_hw_ctl_intf_cfg;
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> index 6292002..d4869a0 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> @@ -158,6 +158,15 @@ struct dpu_hw_ctl_ops {
>  		enum dpu_dspp blk, u32 dspp_sub_blk);
>  
>  	/**
> +	 * OR in the given flushbits to the cached pending_(dsc_)flush_mask
> +	 * No effect on hardware
> +	 * @ctx       : ctl path ctx pointer
> +	 * @blk       : interface block index

Can you drop the spaces before the colon (:)?  That's wrong and will be
fixed elsewhere later.

> +	 */
> +	void (*update_pending_flush_dsc)(struct dpu_hw_ctl *ctx,
> +					 enum dpu_dsc blk);

Indent with a single tab to match the rest.

> +
> +	/**
>  	 * Write the value of the pending_flush_mask to hardware
>  	 * @ctx       : ctl path ctx pointer
>  	 */
> @@ -245,6 +254,7 @@ struct dpu_hw_ctl {
>  	u32 pending_wb_flush_mask;
>  	u32 pending_merge_3d_flush_mask;
>  	u32 pending_dspp_flush_mask[DSPP_MAX - DSPP_0];
> +	u32 pending_dsc_flush_mask;

Don't forget to add this to the doc-comment, or did you skip it by
intention because pending_merge_3d_flush_mask and
pending_dspp_flush_mask are missing as well?

- Marijn

>  
>  	/* ops */
>  	struct dpu_hw_ctl_ops ops;
> -- 
> 2.7.4
>
Abhinav Kumar May 15, 2023, 10:03 p.m. UTC | #9
On 5/15/2023 2:21 PM, Marijn Suijten wrote:
> On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
>>
>> From: Abhinav Kumar <quic_abhinavk@quicinc.com>
>>
>> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
>> feature flag information.  Each display compression engine (DCE) contains
>> dual hard slice DSC encoders so both share same base address but with
>> its own different sub block address.
> 
> Can we have an explanation of hard vs soft slices in some commit message
> and/or code documentation?
> 

Not in this one. It wont look appropriate. I would rather remove "hard" 
to avoid confusion.

>>
>> changes in v4:
>> -- delete DPU_DSC_HW_REV_1_1
>> -- re arrange sc8280xp_dsc[]
>>
>> changes in v4:
>> -- fix checkpatch warning
>>
>> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> ---
>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 14 ++++++++++++
>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h |  7 ++++++
>>   .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   | 16 ++++++++++++++
>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 14 ++++++++++++
>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 14 ++++++++++++
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c     | 25 +++++++++++++++++++++-
>>   6 files changed, 89 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>> index 500cfd0..c4c93c8 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>> @@ -153,6 +153,18 @@ static const struct dpu_merge_3d_cfg sm8350_merge_3d[] = {
>>   	MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>>   };
>>   
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sm8350_dsc[] = {
>> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> 
> Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
> for the ctl sblk).  This simply fills it up to the start of the enc sblk
> so that we can see all registers in the dump?  After all only
> DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
> adequate.
> 

.len today is always only for the dump. and yes even here we have only 
0x100 for the enc and 0x10 for the ctl.

+static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
+	.enc = {.base = 0x100, .len = 0x100},
+	.ctl = {.base = 0xF00, .len = 0x10},
+};

The issue here is that, the dpu snapshot does not handle sub_blk parsing 
today. Its a to-do item. So for that reason, 0x100 was used here to 
atleast get the full encoder dumps.

>> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> 
> Should we add an extra suffix to the name to indicate which hard-slice
> DSC encoder it is?  i.e. "dce_0_0" and "dce_0_1" etc?

Ok, that should be fine. We can add it.

> 
>> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> 

> See comment below about loose BIT() in features.

Responded below.
> 
>> +};
>> +
>>   static const struct dpu_intf_cfg sm8350_intf[] = {
>>   	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>   			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
>>   	.dspp = sm8350_dspp,
>>   	.pingpong_count = ARRAY_SIZE(sm8350_pp),
>>   	.pingpong = sm8350_pp,
>> +	.dsc = sm8350_dsc,
>> +	.dsc_count = ARRAY_SIZE(sm8350_dsc),
> 
> Count goes first **everywhere else**, let's not break consistency here.
> 

the order of DSC entries is swapped for all chipsets. Please refer to 
dpu_sc8180x_cfg, dpu_sm8250_cfg etc.

So if you are talking about consistency, this is actually consistent 
with whats present in other chipsets.

If you are very particular about this, then once this lands, you can 
change the order for all of them in another change.

Same answer for all swap comments.

>>   	.merge_3d_count = ARRAY_SIZE(sm8350_merge_3d),
>>   	.merge_3d = sm8350_merge_3d,
>>   	.intf_count = ARRAY_SIZE(sm8350_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>> index 5646713..42c66fe 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>> @@ -93,6 +93,11 @@ static const struct dpu_pingpong_cfg sc7280_pp[] = {
>>   	PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),
>>   };
>>   
>> +/* NOTE: sc7280 only has one dsc hard slice encoder */
> 
> DSC
> 
>> +static const struct dpu_dsc_cfg sc7280_dsc[] = {
>> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> +};
>> +
>>   static const struct dpu_intf_cfg sc7280_intf[] = {
>>   	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>   			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -149,6 +154,8 @@ const struct dpu_mdss_cfg dpu_sc7280_cfg = {
>>   	.mixer = sc7280_lm,
>>   	.pingpong_count = ARRAY_SIZE(sc7280_pp),
>>   	.pingpong = sc7280_pp,
>> +	.dsc_count = ARRAY_SIZE(sc7280_dsc),
>> +	.dsc = sc7280_dsc,
>>   	.intf_count = ARRAY_SIZE(sc7280_intf),
>>   	.intf = sc7280_intf,
>>   	.vbif_count = ARRAY_SIZE(sdm845_vbif),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>> index 808aacd..1901fff 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>> @@ -141,6 +141,20 @@ static const struct dpu_merge_3d_cfg sc8280xp_merge_3d[] = {
>>   	MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>>   };
>>   
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sc8280xp_dsc[] = {
>> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>> +	DSC_BLK_1_2("dce_2", DSC_4, 0x82000, 0x100, 0, dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_2", DSC_5, 0x82000, 0x100, 0, dsc_sblk_1),
>> +};
>> +
>>   /* TODO: INTF 3, 8 and 7 are used for MST, marked as INTF_NONE for now */
>>   static const struct dpu_intf_cfg sc8280xp_intf[] = {
>>   	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>> @@ -216,6 +230,8 @@ const struct dpu_mdss_cfg dpu_sc8280xp_cfg = {
>>   	.dspp = sc8280xp_dspp,
>>   	.pingpong_count = ARRAY_SIZE(sc8280xp_pp),
>>   	.pingpong = sc8280xp_pp,
>> +	.dsc = sc8280xp_dsc,
>> +	.dsc_count = ARRAY_SIZE(sc8280xp_dsc),
> 
> Swap the two lines.
> 
>>   	.merge_3d_count = ARRAY_SIZE(sc8280xp_merge_3d),
>>   	.merge_3d = sc8280xp_merge_3d,
>>   	.intf_count = ARRAY_SIZE(sc8280xp_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>> index 1a89ff9..741d03f 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>> @@ -161,6 +161,18 @@ static const struct dpu_merge_3d_cfg sm8450_merge_3d[] = {
>>   	MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x65f00),
>>   };
>>   
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sm8450_dsc[] = {
>> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>> +};
>> +
>>   static const struct dpu_intf_cfg sm8450_intf[] = {
>>   	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>   			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -223,6 +235,8 @@ const struct dpu_mdss_cfg dpu_sm8450_cfg = {
>>   	.dspp = sm8450_dspp,
>>   	.pingpong_count = ARRAY_SIZE(sm8450_pp),
>>   	.pingpong = sm8450_pp,
>> +	.dsc = sm8450_dsc,
>> +	.dsc_count = ARRAY_SIZE(sm8450_dsc),
> 
> Another swap.
> 
>>   	.merge_3d_count = ARRAY_SIZE(sm8450_merge_3d),
>>   	.merge_3d = sm8450_merge_3d,
>>   	.intf_count = ARRAY_SIZE(sm8450_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>> index 497b34c..3ee6dc8 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>> @@ -165,6 +165,18 @@ static const struct dpu_merge_3d_cfg sm8550_merge_3d[] = {
>>   	MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x66700),
>>   };
>>   
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sm8550_dsc[] = {
>> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>> +};
>> +
>>   static const struct dpu_intf_cfg sm8550_intf[] = {
>>   	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>   			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -227,6 +239,8 @@ const struct dpu_mdss_cfg dpu_sm8550_cfg = {
>>   	.dspp = sm8550_dspp,
>>   	.pingpong_count = ARRAY_SIZE(sm8550_pp),
>>   	.pingpong = sm8550_pp,
>> +	.dsc = sm8550_dsc,
>> +	.dsc_count = ARRAY_SIZE(sm8550_dsc),
> 
> Swap.
> 
>>   	.merge_3d_count = ARRAY_SIZE(sm8550_merge_3d),
>>   	.merge_3d = sm8550_merge_3d,
>>   	.intf_count = ARRAY_SIZE(sm8550_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>> index 78e4bf6..c1d7338 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>> @@ -1,6 +1,6 @@
>>   // SPDX-License-Identifier: GPL-2.0-only
>>   /* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
>> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
>> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
>>    */
>>   
>>   #define pr_fmt(fmt)	"[drm:%s:%d] " fmt, __func__, __LINE__
>> @@ -522,6 +522,16 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
>>   /*************************************************************
>>    * DSC sub blocks config
>>    *************************************************************/
>> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
>> +	.enc = {.base = 0x100, .len = 0x100},
>> +	.ctl = {.base = 0xF00, .len = 0x10},
>> +};
>> +
>> +static const struct dpu_dsc_sub_blks dsc_sblk_1 = {
>> +	.enc = {.base = 0x200, .len = 0x100},
>> +	.ctl = {.base = 0xF80, .len = 0x10},
>> +};
>> +
>>   #define DSC_BLK(_name, _id, _base, _features) \
>>   	{\
>>   	.name = _name, .id = _id, \
>> @@ -529,6 +539,19 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
>>   	.features = _features, \
>>   	}
>>   
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \
> 
> There are no address values here so this comment doesn't seem very
> useful, and it is already duplicated on every DSC block array, where the
> duplication is more visible.  Drop the comment here?
> 

_base is the address. So base address. Does that clarify things?

>> +	{\
>> +	.name = _name, .id = _id, \
>> +	.base = _base, .len = _len, \
> 
> The len is always 0x100 (downstream says 0x10), should we hardcode it
> here and drop _len?  We can always add it back if a future revision
> starts changing it, but that's not the case currently.
> 
>> +	.features = BIT(DPU_DSC_HW_REV_1_2) | _features, \
> 
> We don't willy-nilly append bits like that: should there be global
> feature flags?

So this approach is actually better. This macro is a DSC_1_2 macro so it 
will have the 1.2 feature flag and other features like native_422 
support of that encoder are ORed on top of it. Nothing wrong with this.

> 
> Or is this the start of a new era where we expand those defines in-line
> and drop them altogether?  I'd much prefer that but we should first
> align on this direction (and then also make the switch globally in a
> followup).
> 

Its case by case. No need to generalize.

In this the feature flag ORed with the base feature flag of DSC_1_2 
makes it more clear.

> - Marijn
> 
>> +	.sblk = &_sblk, \
>> +	}
>> +
>>   /*************************************************************
>>    * INTF sub blocks config
>>    *************************************************************/
>> -- 
>> 2.7.4
>>
Abhinav Kumar May 15, 2023, 10:10 p.m. UTC | #10
On 5/15/2023 3:03 PM, Abhinav Kumar wrote:
> 
> 
> On 5/15/2023 2:21 PM, Marijn Suijten wrote:
>> On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
>>>
>>> From: Abhinav Kumar <quic_abhinavk@quicinc.com>
>>>
>>> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
>>> feature flag information.  Each display compression engine (DCE) 
>>> contains
>>> dual hard slice DSC encoders so both share same base address but with
>>> its own different sub block address.
>>
>> Can we have an explanation of hard vs soft slices in some commit message
>> and/or code documentation?
>>
> 
> Not in this one. It wont look appropriate. I would rather remove "hard" 
> to avoid confusion.
> 
>>>
>>> changes in v4:
>>> -- delete DPU_DSC_HW_REV_1_1
>>> -- re arrange sc8280xp_dsc[]
>>>
>>> changes in v4:
>>> -- fix checkpatch warning
>>>
>>> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>> ---
>>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 14 ++++++++++++
>>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h |  7 ++++++
>>>   .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   | 16 ++++++++++++++
>>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 14 ++++++++++++
>>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 14 ++++++++++++
>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c     | 25 
>>> +++++++++++++++++++++-
>>>   6 files changed, 89 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h 
>>> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>>> index 500cfd0..c4c93c8 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>>> @@ -153,6 +153,18 @@ static const struct dpu_merge_3d_cfg 
>>> sm8350_merge_3d[] = {
>>>       MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>>>   };
>>> +/*
>>> + * NOTE: Each display compression engine (DCE) contains dual hard
>>> + * slice DSC encoders so both share same base address but with
>>> + * its own different sub block address.
>>> + */
>>> +static const struct dpu_dsc_cfg sm8350_dsc[] = {
>>> +    DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>>
>> Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
>> for the ctl sblk).  This simply fills it up to the start of the enc sblk
>> so that we can see all registers in the dump?  After all only
>> DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
>> adequate.
>>
> 
> .len today is always only for the dump. and yes even here we have only 
> 0x100 for the enc and 0x10 for the ctl.
> 
> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
> +    .enc = {.base = 0x100, .len = 0x100},
> +    .ctl = {.base = 0xF00, .len = 0x10},
> +};
> 
> The issue here is that, the dpu snapshot does not handle sub_blk parsing 
> today. Its a to-do item. So for that reason, 0x100 was used here to 
> atleast get the full encoder dumps.
> 
>>> +    DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>>
>> Should we add an extra suffix to the name to indicate which hard-slice
>> DSC encoder it is?  i.e. "dce_0_0" and "dce_0_1" etc?
> 
> Ok, that should be fine. We can add it.
> 
>>
>>> +    DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>>
> 
>> See comment below about loose BIT() in features.
> 
> Responded below.
>>
>>> +};
>>> +
>>>   static const struct dpu_intf_cfg sm8350_intf[] = {
>>>       INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, 
>>> MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>>               DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>>> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
>>>       .dspp = sm8350_dspp,
>>>       .pingpong_count = ARRAY_SIZE(sm8350_pp),
>>>       .pingpong = sm8350_pp,
>>> +    .dsc = sm8350_dsc,
>>> +    .dsc_count = ARRAY_SIZE(sm8350_dsc),
>>
>> Count goes first **everywhere else**, let's not break consistency here.
>>
> 
> the order of DSC entries is swapped for all chipsets. Please refer to 
> dpu_sc8180x_cfg, dpu_sm8250_cfg etc.
> 
> So if you are talking about consistency, this is actually consistent 
> with whats present in other chipsets.
> 
> If you are very particular about this, then once this lands, you can 
> change the order for all of them in another change.
> 
> Same answer for all swap comments.

Actually, I am wrong here, I misread the entries. Sure, I can fix this up.

Please ignore this response and you can check the others.

> 
>>>       .merge_3d_count = ARRAY_SIZE(sm8350_merge_3d),
>>>       .merge_3d = sm8350_merge_3d,
>>>       .intf_count = ARRAY_SIZE(sm8350_intf),
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h 
>>> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>>> index 5646713..42c66fe 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>>> @@ -93,6 +93,11 @@ static const struct dpu_pingpong_cfg sc7280_pp[] = {
>>>       PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, 
>>> sc7280_pp_sblk, -1, -1),
>>>   };
>>> +/* NOTE: sc7280 only has one dsc hard slice encoder */
>>
>> DSC
>>
>>> +static const struct dpu_dsc_cfg sc7280_dsc[] = {
>>> +    DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>>> +};
>>> +
>>>   static const struct dpu_intf_cfg sc7280_intf[] = {
>>>       INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, 
>>> MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>>               DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>>> @@ -149,6 +154,8 @@ const struct dpu_mdss_cfg dpu_sc7280_cfg = {
>>>       .mixer = sc7280_lm,
>>>       .pingpong_count = ARRAY_SIZE(sc7280_pp),
>>>       .pingpong = sc7280_pp,
>>> +    .dsc_count = ARRAY_SIZE(sc7280_dsc),
>>> +    .dsc = sc7280_dsc,
>>>       .intf_count = ARRAY_SIZE(sc7280_intf),
>>>       .intf = sc7280_intf,
>>>       .vbif_count = ARRAY_SIZE(sdm845_vbif),
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h 
>>> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>>> index 808aacd..1901fff 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>>> @@ -141,6 +141,20 @@ static const struct dpu_merge_3d_cfg 
>>> sc8280xp_merge_3d[] = {
>>>       MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>>>   };
>>> +/*
>>> + * NOTE: Each display compression engine (DCE) contains dual hard
>>> + * slice DSC encoders so both share same base address but with
>>> + * its own different sub block address.
>>> + */
>>> +static const struct dpu_dsc_cfg sc8280xp_dsc[] = {
>>> +    DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>>> +    DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>>> +    DSC_BLK_1_2("dce_2", DSC_4, 0x82000, 0x100, 0, dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_2", DSC_5, 0x82000, 0x100, 0, dsc_sblk_1),
>>> +};
>>> +
>>>   /* TODO: INTF 3, 8 and 7 are used for MST, marked as INTF_NONE for 
>>> now */
>>>   static const struct dpu_intf_cfg sc8280xp_intf[] = {
>>>       INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, 
>>> MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>> @@ -216,6 +230,8 @@ const struct dpu_mdss_cfg dpu_sc8280xp_cfg = {
>>>       .dspp = sc8280xp_dspp,
>>>       .pingpong_count = ARRAY_SIZE(sc8280xp_pp),
>>>       .pingpong = sc8280xp_pp,
>>> +    .dsc = sc8280xp_dsc,
>>> +    .dsc_count = ARRAY_SIZE(sc8280xp_dsc),
>>
>> Swap the two lines.
>>
>>>       .merge_3d_count = ARRAY_SIZE(sc8280xp_merge_3d),
>>>       .merge_3d = sc8280xp_merge_3d,
>>>       .intf_count = ARRAY_SIZE(sc8280xp_intf),
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h 
>>> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>>> index 1a89ff9..741d03f 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>>> @@ -161,6 +161,18 @@ static const struct dpu_merge_3d_cfg 
>>> sm8450_merge_3d[] = {
>>>       MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x65f00),
>>>   };
>>> +/*
>>> + * NOTE: Each display compression engine (DCE) contains dual hard
>>> + * slice DSC encoders so both share same base address but with
>>> + * its own different sub block address.
>>> + */
>>> +static const struct dpu_dsc_cfg sm8450_dsc[] = {
>>> +    DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>>> +    DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>>> +};
>>> +
>>>   static const struct dpu_intf_cfg sm8450_intf[] = {
>>>       INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, 
>>> MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>>               DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>>> @@ -223,6 +235,8 @@ const struct dpu_mdss_cfg dpu_sm8450_cfg = {
>>>       .dspp = sm8450_dspp,
>>>       .pingpong_count = ARRAY_SIZE(sm8450_pp),
>>>       .pingpong = sm8450_pp,
>>> +    .dsc = sm8450_dsc,
>>> +    .dsc_count = ARRAY_SIZE(sm8450_dsc),
>>
>> Another swap.
>>
>>>       .merge_3d_count = ARRAY_SIZE(sm8450_merge_3d),
>>>       .merge_3d = sm8450_merge_3d,
>>>       .intf_count = ARRAY_SIZE(sm8450_intf),
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h 
>>> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>>> index 497b34c..3ee6dc8 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>>> @@ -165,6 +165,18 @@ static const struct dpu_merge_3d_cfg 
>>> sm8550_merge_3d[] = {
>>>       MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x66700),
>>>   };
>>> +/*
>>> + * NOTE: Each display compression engine (DCE) contains dual hard
>>> + * slice DSC encoders so both share same base address but with
>>> + * its own different sub block address.
>>> + */
>>> +static const struct dpu_dsc_cfg sm8550_dsc[] = {
>>> +    DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>>> +    DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>>> +    DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, 
>>> BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>>> +};
>>> +
>>>   static const struct dpu_intf_cfg sm8550_intf[] = {
>>>       INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, 
>>> MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>>               DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>>> @@ -227,6 +239,8 @@ const struct dpu_mdss_cfg dpu_sm8550_cfg = {
>>>       .dspp = sm8550_dspp,
>>>       .pingpong_count = ARRAY_SIZE(sm8550_pp),
>>>       .pingpong = sm8550_pp,
>>> +    .dsc = sm8550_dsc,
>>> +    .dsc_count = ARRAY_SIZE(sm8550_dsc),
>>
>> Swap.
>>
>>>       .merge_3d_count = ARRAY_SIZE(sm8550_merge_3d),
>>>       .merge_3d = sm8550_merge_3d,
>>>       .intf_count = ARRAY_SIZE(sm8550_intf),
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>>> index 78e4bf6..c1d7338 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>>> @@ -1,6 +1,6 @@
>>>   // SPDX-License-Identifier: GPL-2.0-only
>>>   /* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
>>> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights 
>>> reserved.
>>> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All 
>>> rights reserved.
>>>    */
>>>   #define pr_fmt(fmt)    "[drm:%s:%d] " fmt, __func__, __LINE__
>>> @@ -522,6 +522,16 @@ static const struct dpu_pingpong_sub_blks 
>>> sc7280_pp_sblk = {
>>>   /*************************************************************
>>>    * DSC sub blocks config
>>>    *************************************************************/
>>> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
>>> +    .enc = {.base = 0x100, .len = 0x100},
>>> +    .ctl = {.base = 0xF00, .len = 0x10},
>>> +};
>>> +
>>> +static const struct dpu_dsc_sub_blks dsc_sblk_1 = {
>>> +    .enc = {.base = 0x200, .len = 0x100},
>>> +    .ctl = {.base = 0xF80, .len = 0x10},
>>> +};
>>> +
>>>   #define DSC_BLK(_name, _id, _base, _features) \
>>>       {\
>>>       .name = _name, .id = _id, \
>>> @@ -529,6 +539,19 @@ static const struct dpu_pingpong_sub_blks 
>>> sc7280_pp_sblk = {
>>>       .features = _features, \
>>>       }
>>> +/*
>>> + * NOTE: Each display compression engine (DCE) contains dual hard
>>> + * slice DSC encoders so both share same base address but with
>>> + * its own different sub block address.
>>> + */
>>> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \
>>
>> There are no address values here so this comment doesn't seem very
>> useful, and it is already duplicated on every DSC block array, where the
>> duplication is more visible.  Drop the comment here?
>>
> 
> _base is the address. So base address. Does that clarify things?
> 
>>> +    {\
>>> +    .name = _name, .id = _id, \
>>> +    .base = _base, .len = _len, \
>>
>> The len is always 0x100 (downstream says 0x10), should we hardcode it
>> here and drop _len?  We can always add it back if a future revision
>> starts changing it, but that's not the case currently.
>>
>>> +    .features = BIT(DPU_DSC_HW_REV_1_2) | _features, \
>>
>> We don't willy-nilly append bits like that: should there be global
>> feature flags?
> 
> So this approach is actually better. This macro is a DSC_1_2 macro so it 
> will have the 1.2 feature flag and other features like native_422 
> support of that encoder are ORed on top of it. Nothing wrong with this.
> 
>>
>> Or is this the start of a new era where we expand those defines in-line
>> and drop them altogether?  I'd much prefer that but we should first
>> align on this direction (and then also make the switch globally in a
>> followup).
>>
> 
> Its case by case. No need to generalize.
> 
> In this the feature flag ORed with the base feature flag of DSC_1_2 
> makes it more clear.
> 
>> - Marijn
>>
>>> +    .sblk = &_sblk, \
>>> +    }
>>> +
>>>   /*************************************************************
>>>    * INTF sub blocks config
>>>    *************************************************************/
>>> -- 
>>> 2.7.4
>>>
Marijn Suijten May 15, 2023, 10:23 p.m. UTC | #11
On 2023-05-15 15:03:46, Abhinav Kumar wrote:
> On 5/15/2023 2:21 PM, Marijn Suijten wrote:
> > On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
> >>
> >> From: Abhinav Kumar <quic_abhinavk@quicinc.com>
> >>
> >> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
> >> feature flag information.  Each display compression engine (DCE) contains
> >> dual hard slice DSC encoders so both share same base address but with
> >> its own different sub block address.
> > 
> > Can we have an explanation of hard vs soft slices in some commit message
> > and/or code documentation?
> > 
> 
> Not in this one. It wont look appropriate. I would rather remove "hard" 
> to avoid confusion.

That is totally fine, let's remove it instead.

<snip>
> >> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> > 
> > Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
> > for the ctl sblk).  This simply fills it up to the start of the enc sblk
> > so that we can see all registers in the dump?  After all only
> > DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
> > adequate.
> > 
> 
> .len today is always only for the dump. and yes even here we have only 
> 0x100 for the enc and 0x10 for the ctl.
> 
> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
> +	.enc = {.base = 0x100, .len = 0x100},
> +	.ctl = {.base = 0xF00, .len = 0x10},
> +};
> 
> The issue here is that, the dpu snapshot does not handle sub_blk parsing 
> today. Its a to-do item. So for that reason, 0x100 was used here to 
> atleast get the full encoder dumps.

But then you don't see the ENC block?  It starts at 0x100 (or 0x200) so
then the length should be longer... it should in fact depend on even/odd
DCE then?

> 
> >> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> > 
> > Should we add an extra suffix to the name to indicate which hard-slice
> > DSC encoder it is?  i.e. "dce_0_0" and "dce_0_1" etc?
> 
> Ok, that should be fine. We can add it.

Great, thanks!

> >> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> >> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> > 
> 
> > See comment below about loose BIT() in features.
> 
> Responded below.
> > 
> >> +};
> >> +
> >>   static const struct dpu_intf_cfg sm8350_intf[] = {
> >>   	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> >>   			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> >> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
> >>   	.dspp = sm8350_dspp,
> >>   	.pingpong_count = ARRAY_SIZE(sm8350_pp),
> >>   	.pingpong = sm8350_pp,
> >> +	.dsc = sm8350_dsc,
> >> +	.dsc_count = ARRAY_SIZE(sm8350_dsc),
> > 
> > Count goes first **everywhere else**, let's not break consistency here.
> > 
> 
> the order of DSC entries is swapped for all chipsets. Please refer to 
> dpu_sc8180x_cfg, dpu_sm8250_cfg etc.

Thanks for confirming that this is not the case in a followup mail :)

> So if you are talking about consistency, this is actually consistent 
> with whats present in other chipsets.
> 
> If you are very particular about this, then once this lands, you can 
> change the order for all of them in another change.
> 
> Same answer for all swap comments.
<snip>
> >> +/*
> >> + * NOTE: Each display compression engine (DCE) contains dual hard
> >> + * slice DSC encoders so both share same base address but with
> >> + * its own different sub block address.
> >> + */
> >> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \
> > 
> > There are no address values here so this comment doesn't seem very
> > useful, and it is already duplicated on every DSC block array, where the
> > duplication is more visible.  Drop the comment here?
> > 
> 
> _base is the address. So base address. Does that clarify things?

This is referring to the NOTE: comment above.  There's _base as address
here, yes, but there's no context here that it'll be used in duplicate
fashion, unlike the SoC catalog files.  The request is to just drop it
here as it adds no value.

> >> +	{\
> >> +	.name = _name, .id = _id, \
> >> +	.base = _base, .len = _len, \
> > 
> > The len is always 0x100 (downstream says 0x10), should we hardcode it
> > here and drop _len?  We can always add it back if a future revision
> > starts changing it, but that's not the case currently.
> > 
> >> +	.features = BIT(DPU_DSC_HW_REV_1_2) | _features, \
> > 
> > We don't willy-nilly append bits like that: should there be global
> > feature flags?
> 
> So this approach is actually better. This macro is a DSC_1_2 macro so it 
> will have the 1.2 feature flag and other features like native_422 
> support of that encoder are ORed on top of it. Nothing wrong with this.

I agree it is better, but we seem to be very selective in whether to
stick to the "old" principles in DPU versus applying a new pattern that
isn't used elsewhere yet (i.e. your request to _not_ shuffle the order
of .dsc and .dsc_count assignment to match other .x and .x_count, and do
that in a future patch instead).  If we want to be consistent
everywhere, these should be #defines that we'll flatten out in a
followup if at all.

> > Or is this the start of a new era where we expand those defines in-line
> > and drop them altogether?  I'd much prefer that but we should first
> > align on this direction (and then also make the switch globally in a
> > followup).
> > 
> 
> Its case by case. No need to generalize.
> 
> In this the feature flag ORed with the base feature flag of DSC_1_2 
> makes it more clear.

- Marijn
Abhinav Kumar May 15, 2023, 11:58 p.m. UTC | #12
On 5/15/2023 3:23 PM, Marijn Suijten wrote:
> On 2023-05-15 15:03:46, Abhinav Kumar wrote:
>> On 5/15/2023 2:21 PM, Marijn Suijten wrote:
>>> On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
>>>>
>>>> From: Abhinav Kumar <quic_abhinavk@quicinc.com>
>>>>
>>>> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
>>>> feature flag information.  Each display compression engine (DCE) contains
>>>> dual hard slice DSC encoders so both share same base address but with
>>>> its own different sub block address.
>>>
>>> Can we have an explanation of hard vs soft slices in some commit message
>>> and/or code documentation?
>>>
>>
>> Not in this one. It wont look appropriate. I would rather remove "hard"
>> to avoid confusion.
> 
> That is totally fine, let's remove it instead.
> 
> <snip>
>>>> +	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>>>
>>> Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
>>> for the ctl sblk).  This simply fills it up to the start of the enc sblk
>>> so that we can see all registers in the dump?  After all only
>>> DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
>>> adequate.
>>>
>>
>> .len today is always only for the dump. and yes even here we have only
>> 0x100 for the enc and 0x10 for the ctl.
>>
>> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
>> +	.enc = {.base = 0x100, .len = 0x100},
>> +	.ctl = {.base = 0xF00, .len = 0x10},
>> +};
>>
>> The issue here is that, the dpu snapshot does not handle sub_blk parsing
>> today. Its a to-do item. So for that reason, 0x100 was used here to
>> atleast get the full encoder dumps.
> 
> But then you don't see the ENC block?  It starts at 0x100 (or 0x200) so
> then the length should be longer... it should in fact depend on even/odd
> DCE then?
> 

You are right that the length should be longer. It should be 0x29c then 
and ctl blocks will not be included anyway.

The fundamental issue which remains despite increasing the length will 
be that the two blocks will print duplicate information. So dce_0_0 and 
dce_0_1 will print the same information twice as the base address is the 
same.

Odd/even DCE intelligence is not there in these macros and will be an 
overkill to just support the dump.

So overall, I dont think any of it is a good solution yet.

I think the best way to do this will be to add sub-block parsing support 
to the DPU devcoredump.

So what will happen is similar to downstream design in sde_dbg, when a 
block has sub-blocks we will respect the sub-block's len and not the 
parent block's as that will be more accurate.

If 0x29c is going to help the cause till then we can change it.

>>
>>>> +	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>>>
>>> Should we add an extra suffix to the name to indicate which hard-slice
>>> DSC encoder it is?  i.e. "dce_0_0" and "dce_0_1" etc?
>>
>> Ok, that should be fine. We can add it.
> 
> Great, thanks!
> 
>>>> +	DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>>>> +	DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>>>
>>
>>> See comment below about loose BIT() in features.
>>
>> Responded below.
>>>
>>>> +};
>>>> +
>>>>    static const struct dpu_intf_cfg sm8350_intf[] = {
>>>>    	INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>>>>    			DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>>>> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
>>>>    	.dspp = sm8350_dspp,
>>>>    	.pingpong_count = ARRAY_SIZE(sm8350_pp),
>>>>    	.pingpong = sm8350_pp,
>>>> +	.dsc = sm8350_dsc,
>>>> +	.dsc_count = ARRAY_SIZE(sm8350_dsc),
>>>
>>> Count goes first **everywhere else**, let's not break consistency here.
>>>
>>
>> the order of DSC entries is swapped for all chipsets. Please refer to
>> dpu_sc8180x_cfg, dpu_sm8250_cfg etc.
> 
> Thanks for confirming that this is not the case in a followup mail :)
> 
>> So if you are talking about consistency, this is actually consistent
>> with whats present in other chipsets.
>>
>> If you are very particular about this, then once this lands, you can
>> change the order for all of them in another change.
>>
>> Same answer for all swap comments.
> <snip>
>>>> +/*
>>>> + * NOTE: Each display compression engine (DCE) contains dual hard
>>>> + * slice DSC encoders so both share same base address but with
>>>> + * its own different sub block address.
>>>> + */
>>>> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \
>>>
>>> There are no address values here so this comment doesn't seem very
>>> useful, and it is already duplicated on every DSC block array, where the
>>> duplication is more visible.  Drop the comment here?
>>>
>>
>> _base is the address. So base address. Does that clarify things?
> 
> This is referring to the NOTE: comment above.  There's _base as address
> here, yes, but there's no context here that it'll be used in duplicate
> fashion, unlike the SoC catalog files.  The request is to just drop it
> here as it adds no value.
>

The duplication is there. the base is same for both the entries with dce_0.

+static const struct dpu_dsc_cfg sc8280xp_dsc[] = {
+	DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
+	DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),

Both the blks use 0x80000 as base address and the note is just telling that.

>>>> +	{\
>>>> +	.name = _name, .id = _id, \
>>>> +	.base = _base, .len = _len, \
>>>
>>> The len is always 0x100 (downstream says 0x10), should we hardcode it
>>> here and drop _len?  We can always add it back if a future revision
>>> starts changing it, but that's not the case currently.
>>>
>>>> +	.features = BIT(DPU_DSC_HW_REV_1_2) | _features, \
>>>
>>> We don't willy-nilly append bits like that: should there be global
>>> feature flags?
>>
>> So this approach is actually better. This macro is a DSC_1_2 macro so it
>> will have the 1.2 feature flag and other features like native_422
>> support of that encoder are ORed on top of it. Nothing wrong with this.
> 
> I agree it is better, but we seem to be very selective in whether to
> stick to the "old" principles in DPU versus applying a new pattern that
> isn't used elsewhere yet (i.e. your request to _not_ shuffle the order
> of .dsc and .dsc_count assignment to match other .x and .x_count, and do
> that in a future patch instead).  If we want to be consistent
> everywhere, these should be #defines that we'll flatten out in a
> followup if at all.
> 

Yes, if it the order was already swapped, then we could have maintained 
it for this patch and cleaned all of them up together. Nothing wrong in 
that approach.

But I already clarified that was a mistake. The order of dsc and 
dsc_count is not swapped so I agreed to do that here. So where is the 
inconsistency?

>>> Or is this the start of a new era where we expand those defines in-line
>>> and drop them altogether?  I'd much prefer that but we should first
>>> align on this direction (and then also make the switch globally in a
>>> followup).
>>>
>>
>> Its case by case. No need to generalize.
>>
>> In this the feature flag ORed with the base feature flag of DSC_1_2
>> makes it more clear.
> 
> - Marijn