[v2,00/22] drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+

Message ID	20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com
Headers	show Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com [205.220.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4449A186E5E; Tue, 24 Sep 2024 23:00:16 +0000 (UTC) From: Jessica Zhang <quic_jesszhan@quicinc.com> Subject: [PATCH v2 00/22] drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+ Date: Tue, 24 Sep 2024 15:59:16 -0700 Message-ID: <20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit To: Rob Clark <robdclark@gmail.com>, Dmitry Baryshkov <dmitry.baryshkov@linaro.org>, <quic_abhinavk@quicinc.com>, Sean Paul <sean@poorly.run>, Marijn Suijten <marijn.suijten@somainline.org>, "David Airlie" <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>, "Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>, Maxime Ripard <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de> CC: <quic_ebharadw@quicinc.com>, <linux-arm-msm@vger.kernel.org>, <dri-devel@lists.freedesktop.org>, <freedreno@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>, Rob Clark <robdclark@chromium.org>, =?utf-8?b?VmlsbGUgU3lyasOkbMOk?= <ville.syrjala@linux.intel.com>, "Jessica Zhang" <quic_jesszhan@quicinc.com>
Series	drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+ \| expand [v2,00/22] drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+ [v2,02/22] drm: Add valid clones check [v2,03/22] drm/msm/dpu: get rid of struct dpu_rm_requirements [v2,07/22] drm/msm/dpu: Add CWB entry to catalog for SM8650 [v2,08/22] drm/msm/dpu: Specify dedicated CWB pingpong blocks [v2,09/22] drm/msm/dpu: add devcoredumps for cwb registers [v2,13/22] drm/msm/dpu: Add CWB to msm_display_topology [v2,15/22] drm/msm/dpu: Reserve resources for CWB [v2,16/22] drm/msm/dpu: Configure CWB in writeback encoder [v2,17/22] drm/msm/dpu: Support CWB in dpu_hw_ctl [v2,19/22] drm/msm/dpu: Start frame done timer after encoder kickoff [v2,22/22] drm/msm/dpu: Set possible clones for all encoders

Message ID

20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com

Headers

From: Jessica Zhang <quic_jesszhan@quicinc.com>
Subject: [PATCH v2 00/22] drm/msm/dpu: Add Concurrent Writeback Support for
 DPU 10.x+
Date: Tue, 24 Sep 2024 15:59:16 -0700
Message-ID: <20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com>
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: Rob Clark <robdclark@gmail.com>,
        Dmitry Baryshkov
	<dmitry.baryshkov@linaro.org>,
        <quic_abhinavk@quicinc.com>, Sean Paul
	<sean@poorly.run>,
        Marijn Suijten <marijn.suijten@somainline.org>,
        "David
 Airlie" <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
        "Maarten
 Lankhorst" <maarten.lankhorst@linux.intel.com>,
        Maxime Ripard
	<mripard@kernel.org>,
        Thomas Zimmermann <tzimmermann@suse.de>
CC: <quic_ebharadw@quicinc.com>, <linux-arm-msm@vger.kernel.org>,
 <dri-devel@lists.freedesktop.org>, <freedreno@lists.freedesktop.org>,
 <linux-kernel@vger.kernel.org>, Rob Clark <robdclark@chromium.org>,
	=?utf-8?b?VmlsbGUgU3lyasOkbMOk?= <ville.syrjala@linux.intel.com>,
 "Jessica Zhang" <quic_jesszhan@quicinc.com>

Series

drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+ | expand

Message

Jessica Zhang Sept. 24, 2024, 10:59 p.m. UTC

DPU supports a single writeback session running concurrently with primary
display when the CWB mux is configured properly. This series enables
clone mode for DPU driver and adds support for programming the CWB mux
in cases where the hardware has dedicated CWB pingpong blocks. Currently,
the CWB hardware blocks have only been added to the SM8650
hardware catalog.

This changes are split into two parts:

The first part of the series will pull in Dmitry's patches to refactor
the DPU resource manager to be based off of CRTC instead of encoder.
This includes some changes (noted in the relevant commits) by me and
Abhinav to fix some issues with getting the global state and refactoring
the CDM allocation to work with Dmitry's changes.

The second part of the series will add support for CWB by doing the
following:

1) Add a DRM helper to detect if the current CRTC state is in clone mode
   and add an "in_clone_mode" entry to the atomic state print
2) Add the CWB mux to the hardware catalog and clarify the pingpong
   block index enum to specifiy which pingpong blocks are dedicated to
   CWB only and which ones are general use pingpong blocks
3) Add CWB as part of the devcoredump
4) Add support for configuring the CWB mux via dpu_hw_cwb ops
5) Add pending flush support for CWB
6) Add support for validating clone mode in the DPU CRTC and setting up
   CWB within the encoder
7) Adjust the encoder trigger flush, trigger start, and kickoff order to
   accomodate clone mode
8) Adjust when the frame done timer is started for clone mode
9) Define the possible clones for DPU encoders so that 

The feature was tested on SM8650 using IGT's kms_writeback test with the
following change [1] and dumping the writeback framebuffer when in clone
mode. I haven't gotten the chance to test it on DP yet, but I've
validated both single and dual LM on DSI.

To test CWB with IGT, you'll need to apply this series [1] and run
the following command:

IGT_FRAME_DUMP_PATH=<dump path> FRAME_PNG_FILE_NAME=<file name> \
./build/tests/kms_writeback -d [--run-subtest dump-valid-clones] \

[1] https://patchwork.freedesktop.org/series/137933/

---
Changes in v2:
- Moved CWB hardware programming to its own dpu_hw_cwb abstraction
  (Dmitry)
- Reserve and get assigned CWB muxes using RM API and KMS global state
  (Dmitry)
- Dropped requirement to have only one CWB session at a time
- Moved valid clone mode check to DRM framework (Dmitry and Ville)
- Switch to default CWB tap point to LM as the DSPP
- Dropped printing clone mode status in atomic state (Dmitry)
- Call dpu_vbif_clear_errors() before dpu_encoder_kickoff() (Dmitry)
- Squashed setup_input_ctrl() and setup_input_mode() into a single
  dpu_hw_cwb op (Dmitry)
- Moved function comment docs to correct place and fixed wording of
  comments/commit messages (Dmitry)
- Grabbed old CRTC state using proper drm_atomic_state API in
  dpu_crtc_atomic_check() (Dmitry)
- Split HW catalog changes of adding the CWB mux block and changing the
  dedicated CWB pingpong indices into 2 separate commits (Dmitry)
- Moved clearing the dpu_crtc_state.num_mixers to "drm/msm/dpu: fill
  CRTC resources in dpu_crtc.c" (Dmitry)
- Fixed alignment and other formatting issues (Dmitry)
- Link to v1: https://lore.kernel.org/r/20240829-concurrent-wb-v1-0-502b16ae2ebb@quicinc.com

---
Dmitry Baryshkov (4):
      drm/msm/dpu: get rid of struct dpu_rm_requirements
      drm/msm/dpu: switch RM to use crtc_id rather than enc_id for allocation
      drm/msm/dpu: move resource allocation to CRTC
      drm/msm/dpu: fill CRTC resources in dpu_crtc.c

Esha Bharadwaj (3):
      drm/msm/dpu: Add CWB entry to catalog for SM8650
      drm/msm/dpu: add devcoredumps for cwb registers
      drm/msm/dpu: add CWB support to dpu_hw_wb

Jessica Zhang (15):
      drm: add clone mode check for CRTC
      drm: Add valid clones check
      drm/msm/dpu: Specify dedicated CWB pingpong blocks
      drm/msm/dpu: Add dpu_hw_cwb abstraction for CWB block
      drm/msm/dpu: Add RM support for allocating CWB
      drm/msm/dpu: Add CWB to msm_display_topology
      drm/msm/dpu: Require modeset if clone mode status changes
      drm/msm/dpu: Reserve resources for CWB
      drm/msm/dpu: Configure CWB in writeback encoder
      drm/msm/dpu: Support CWB in dpu_hw_ctl
      drm/msm/dpu: Adjust writeback phys encoder setup for CWB
      drm/msm/dpu: Start frame done timer after encoder kickoff
      drm/msm/dpu: Skip trigger flush and start for CWB
      drm/msm/dpu: Reorder encoder kickoff for CWB
      drm/msm/dpu: Set possible clones for all encoders

 drivers/gpu/drm/drm_atomic_helper.c                |  23 ++
 drivers/gpu/drm/msm/Makefile                       |   1 +
 .../drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h    |  29 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h |   4 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h |   4 +-
 .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h   |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c           | 200 +++++++++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        | 434 +++++++++++++--------
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h        |  25 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |  16 +-
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c    |  16 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |  13 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c         |  30 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h         |  15 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.c         |  73 ++++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h         |  70 ++++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h        |  15 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c          |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c            |  12 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h            |  13 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             | 354 ++++++++++-------
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h             |  16 +-
 drivers/gpu/drm/msm/msm_drv.h                      |   2 +
 include/drm/drm_crtc.h                             |   7 +
 24 files changed, 1025 insertions(+), 355 deletions(-)
---
base-commit: 40227086b02f4b36742db313ba98bb51ca325571
change-id: 20240618-concurrent-wb-97d62387f952

Best regards,

Comments

Jessica Zhang Sept. 25, 2024, 8:47 p.m. UTC | #1

On 9/25/2024 1:12 AM, Jani Nikula wrote:
> On Tue, 24 Sep 2024, Jessica Zhang <quic_jesszhan@quicinc.com> wrote:
>> Add helper to check if the given CRTC state is in clone mode
>>
>> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
>> ---
>>   include/drm/drm_crtc.h | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
>> index 8b48a1974da3..ecb93e2c4afc 100644
>> --- a/include/drm/drm_crtc.h
>> +++ b/include/drm/drm_crtc.h
>> @@ -1323,5 +1323,12 @@ static inline struct drm_crtc *drm_crtc_find(struct drm_device *dev,
>>   
>>   int drm_crtc_create_scaling_filter_property(struct drm_crtc *crtc,
>>   					    unsigned int supported_filters);
>> +static inline bool drm_crtc_in_clone_mode(struct drm_crtc_state *crtc_state)
>> +{
>> +	if (!crtc_state)
>> +		return false;
>> +
>> +	return hweight32(crtc_state->encoder_mask) > 1;
>> +}
> 
> What's the benefit of this being static inline?
> 
> You're implicitly depending on hweight32() being available, basically
> <linux/bitops.h> being included. Maybe it already is, but it's the
> accumulation of small and innocent looking things like this that then
> explode the header dependencies, and make them harder to reduce.

Hi Jani,

Good point, I'll move the implementation to drm_crtc.c.

Thanks,

Jessica Zhang

> 
> BR,
> Jani.
> 
>>   
>>   #endif /* __DRM_CRTC_H__ */
> 
> -- 
> Jani Nikula, Intel

Abhinav Kumar Sept. 25, 2024, 9:49 p.m. UTC | #2

On 9/25/2024 2:11 PM, Dmitry Baryshkov wrote:
> On Wed, 25 Sept 2024 at 22:39, Jessica Zhang <quic_jesszhan@quicinc.com> wrote:
>>
>>
>>
>> On 9/24/2024 4:13 PM, Dmitry Baryshkov wrote:
>>> On Tue, Sep 24, 2024 at 03:59:21PM GMT, Jessica Zhang wrote:
>>>> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>>>
>>>> All resource allocation is centered around the LMs. Then other blocks
>>>> (except DSCs) are allocated basing on the LMs that was selected, and LM
>>>> powers up the CRTC rather than the encoder.
>>>>
>>>> Moreover if at some point the driver supports encoder cloning,
>>>> allocating resources from the encoder will be incorrect, as all clones
>>>> will have different encoder IDs, while LMs are to be shared by these
>>>> encoders.
>>>>
>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>>> [quic_abhinavk@quicinc.com: Refactored resource allocation for CDM]
>>>> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
>>>> [quic_jesszhan@quicinc.com: Changed to grabbing exising global state]
>>>> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
>>>> ---
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c    |  86 ++++++++++++
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 201 +++++++++++-----------------
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h |  19 +++
>>>>    3 files changed, 183 insertions(+), 123 deletions(-)
>>>>
>>>> @@ -544,159 +542,117 @@ void dpu_encoder_helper_split_config(
>>>>       }
>>>>    }
>>>>
>>>> -bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
>>>> +void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
>>>> +                             struct msm_display_topology *topology,
>>>> +                             struct drm_atomic_state *state,
>>>> +                             const struct drm_display_mode *adj_mode)
>>>>    {
>>>>       struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>>>> -    int i, intf_count = 0, num_dsc = 0;
>>>> +    struct drm_connector *connector;
>>>> +    struct drm_connector_state *conn_state;
>>>> +    struct msm_display_info *disp_info;
>>>> +    struct drm_framebuffer *fb;
>>>> +    struct msm_drm_private *priv;
>>>> +    int i;
>>>>
>>>>       for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
>>>>               if (dpu_enc->phys_encs[i])
>>>> -                    intf_count++;
>>>> +                    topology->num_intf++;
>>>>
>>>> -    /* See dpu_encoder_get_topology, we only support 2:2:1 topology */
>>>> +    /* We only support 2 DSC mode (with 2 LM and 1 INTF) */
>>>>       if (dpu_enc->dsc)
>>>> -            num_dsc = 2;
>>>> +            topology->num_dsc += 2;
>>>>
>>>> -    return (num_dsc > 0) && (num_dsc > intf_count);
>>>> -}
>>>> +    connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc);
>>>> +    if (!connector)
>>>> +            return;
>>>> +    conn_state = drm_atomic_get_new_connector_state(state, connector);
>>>> +    if (!conn_state)
>>>> +            return;
>>>>
>>>> -struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
>>>> -{
>>>> -    struct msm_drm_private *priv = drm_enc->dev->dev_private;
>>>> -    struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>>>> -    int index = dpu_enc->disp_info.h_tile_instance[0];
>>>> +    disp_info = &dpu_enc->disp_info;
>>>>
>>>> -    if (dpu_enc->disp_info.intf_type == INTF_DSI)
>>>> -            return msm_dsi_get_dsc_config(priv->dsi[index]);
>>>> +    priv = drm_enc->dev->dev_private;
>>>>
>>>> -    return NULL;
>>>> +    /*
>>>> +     * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it.
>>>> +     * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check()
>>>> +     * earlier.
>>>> +     */
>>>> +    if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) {
>>>> +            fb = conn_state->writeback_job->fb;
>>>> +
>>>> +            if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb)))
>>>> +                    topology->needs_cdm = true;
>>>> +    } else if (disp_info->intf_type == INTF_DP) {
>>>> +            if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode))
>>>> +                    topology->needs_cdm = true;
>>>> +    }
>>>
>>> Just to note, the needs_cdm is not enough once you introduce CWB
>>> support. E.g. DP/YUV420 + WB/YUV case requires two CDM blocks (which we
>>> don't have), but this doesn't get reflected in the topology.
>>
>> Hi Dmitry,
>>
>> Ack. I can add something to make atomic_check fail if the input FB is
>> YUV format and CWB is enabled.
> 
> I'd prefer for this to be more natural rather than just checking for
> the DP && DP_YUV420 && WB && WB_FMT_YUV. In the worst case, count CDM
> requests and then in RM check them against the catalog. But I had a
> more logical (although more intrusive) implementation on my mind:
> 
> struct msm_display_topology {
>      struct {
>        u32 num_intf;
>        u32 num_wb;
>        u32 num_dsc;
>        bool needs_cdm;
>      } outputs[MAX_OUTPUTS];
>      u32 num_lm;
> };
> 
> WDYT?
> 

the struct msm_display_topology was originally designed as a per-encoder 
struct (dpu_encoder_get_topology() indicates the same). Making this an 
array of outputs was not needed as there is expected to be one struct 
msm_display_topology for each virt encoder's requested topology and the 
blocks inside of it other than LM, are "encoder" hw blocks.

needs_cdm was made a boolean instead of a num_cdm_count like other 
hardware blocks because till the most recent chipset, we have only one 
CDM block. Whenever we do have more CDM blocks why will introducing 
num_cdm to the topology struct not solve your problem rather than making 
it an array of outputs?

Because then, RM will know that the request exceeds the max blocks.

I think what you are trying to do now is make struct 
msm_display_topology's encoder parts per-encoder and rest like num_lm 
per "RM session".

The thought is not wrong but at the same time seems a bit of an overkill 
because its mostly already like that. Apart from CDM for which I have no 
indication of another one getting added, rest of the blocks are already 
aligned towards a per-encoder model and not a "RM session" model.

Even if we end up doing it this way, most of the model is kind of unused 
really because each encoder will request its own topology anyway, there 
is just no aggregation for CDM which at this point is not needed as 
there is no HW we are aware of needing this.

I think the atomic_check validation is needed either way because if two 
encoders request cdm, we cannot do clone mode as there is only one cdm 
block today. Its just that we are not tracking num_cdm today due to 
reasons explained above but basically doing something like below seems 
right to me:

if (enc_is_in_clone_mode && needs_cdm)
	return -ENOTSUPPORTED;

When we add more cdm_blocks, we can drop this check and making needs_cdm 
a num_cdm will make it naturally fail.

>>
>> Thanks,
>>
>> Jessica Zhang
>>
>>>
>>>>    }
>>>>
>>> --
>>> With best wishes
>>> Dmitry
>>
> 
>

Dmitry Baryshkov Sept. 26, 2024, 7:12 a.m. UTC | #3

On Wed, Sep 25, 2024 at 02:49:48PM GMT, Abhinav Kumar wrote:
> On 9/25/2024 2:11 PM, Dmitry Baryshkov wrote:
> > On Wed, 25 Sept 2024 at 22:39, Jessica Zhang <quic_jesszhan@quicinc.com> wrote:
> > > On 9/24/2024 4:13 PM, Dmitry Baryshkov wrote:
> > > > On Tue, Sep 24, 2024 at 03:59:21PM GMT, Jessica Zhang wrote:
> > > > > From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > > > > 
> > > > > All resource allocation is centered around the LMs. Then other blocks
> > > > > (except DSCs) are allocated basing on the LMs that was selected, and LM
> > > > > powers up the CRTC rather than the encoder.
> > > > > 
> > > > > Moreover if at some point the driver supports encoder cloning,
> > > > > allocating resources from the encoder will be incorrect, as all clones
> > > > > will have different encoder IDs, while LMs are to be shared by these
> > > > > encoders.
> > > > > 
> > > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > > > > [quic_abhinavk@quicinc.com: Refactored resource allocation for CDM]
> > > > > Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
> > > > > [quic_jesszhan@quicinc.com: Changed to grabbing exising global state]
> > > > > Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
> > > > > ---
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c    |  86 ++++++++++++
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 201 +++++++++++-----------------
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h |  19 +++
> > > > >    3 files changed, 183 insertions(+), 123 deletions(-)
> > > > > 
> > > > > @@ -544,159 +542,117 @@ void dpu_encoder_helper_split_config(
> > > > >       }
> > > > >    }
> > > > > 
> > > > > -bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
> > > > > +void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
> > > > > +                             struct msm_display_topology *topology,
> > > > > +                             struct drm_atomic_state *state,
> > > > > +                             const struct drm_display_mode *adj_mode)
> > > > >    {
> > > > >       struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> > > > > -    int i, intf_count = 0, num_dsc = 0;
> > > > > +    struct drm_connector *connector;
> > > > > +    struct drm_connector_state *conn_state;
> > > > > +    struct msm_display_info *disp_info;
> > > > > +    struct drm_framebuffer *fb;
> > > > > +    struct msm_drm_private *priv;
> > > > > +    int i;
> > > > > 
> > > > >       for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
> > > > >               if (dpu_enc->phys_encs[i])
> > > > > -                    intf_count++;
> > > > > +                    topology->num_intf++;
> > > > > 
> > > > > -    /* See dpu_encoder_get_topology, we only support 2:2:1 topology */
> > > > > +    /* We only support 2 DSC mode (with 2 LM and 1 INTF) */
> > > > >       if (dpu_enc->dsc)
> > > > > -            num_dsc = 2;
> > > > > +            topology->num_dsc += 2;
> > > > > 
> > > > > -    return (num_dsc > 0) && (num_dsc > intf_count);
> > > > > -}
> > > > > +    connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc);
> > > > > +    if (!connector)
> > > > > +            return;
> > > > > +    conn_state = drm_atomic_get_new_connector_state(state, connector);
> > > > > +    if (!conn_state)
> > > > > +            return;
> > > > > 
> > > > > -struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
> > > > > -{
> > > > > -    struct msm_drm_private *priv = drm_enc->dev->dev_private;
> > > > > -    struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> > > > > -    int index = dpu_enc->disp_info.h_tile_instance[0];
> > > > > +    disp_info = &dpu_enc->disp_info;
> > > > > 
> > > > > -    if (dpu_enc->disp_info.intf_type == INTF_DSI)
> > > > > -            return msm_dsi_get_dsc_config(priv->dsi[index]);
> > > > > +    priv = drm_enc->dev->dev_private;
> > > > > 
> > > > > -    return NULL;
> > > > > +    /*
> > > > > +     * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it.
> > > > > +     * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check()
> > > > > +     * earlier.
> > > > > +     */
> > > > > +    if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) {
> > > > > +            fb = conn_state->writeback_job->fb;
> > > > > +
> > > > > +            if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb)))
> > > > > +                    topology->needs_cdm = true;
> > > > > +    } else if (disp_info->intf_type == INTF_DP) {
> > > > > +            if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode))
> > > > > +                    topology->needs_cdm = true;
> > > > > +    }
> > > > 
> > > > Just to note, the needs_cdm is not enough once you introduce CWB
> > > > support. E.g. DP/YUV420 + WB/YUV case requires two CDM blocks (which we
> > > > don't have), but this doesn't get reflected in the topology.
> > > 
> > > Hi Dmitry,
> > > 
> > > Ack. I can add something to make atomic_check fail if the input FB is
> > > YUV format and CWB is enabled.
> > 
> > I'd prefer for this to be more natural rather than just checking for
> > the DP && DP_YUV420 && WB && WB_FMT_YUV. In the worst case, count CDM
> > requests and then in RM check them against the catalog. But I had a
> > more logical (although more intrusive) implementation on my mind:
> > 
> > struct msm_display_topology {
> >      struct {
> >        u32 num_intf;
> >        u32 num_wb;
> >        u32 num_dsc;
> >        bool needs_cdm;
> >      } outputs[MAX_OUTPUTS];
> >      u32 num_lm;
> > };
> > 
> > WDYT?
> > 
> 
> the struct msm_display_topology was originally designed as a per-encoder
> struct (dpu_encoder_get_topology() indicates the same). Making this an array
> of outputs was not needed as there is expected to be one struct
> msm_display_topology for each virt encoder's requested topology and the
> blocks inside of it other than LM, are "encoder" hw blocks.
> 
> needs_cdm was made a boolean instead of a num_cdm_count like other hardware
> blocks because till the most recent chipset, we have only one CDM block.
> Whenever we do have more CDM blocks why will introducing num_cdm to the
> topology struct not solve your problem rather than making it an array of
> outputs?
> 
> Because then, RM will know that the request exceeds the max blocks.
> 
> I think what you are trying to do now is make struct msm_display_topology's
> encoder parts per-encoder and rest like num_lm per "RM session".
> 
> The thought is not wrong but at the same time seems a bit of an overkill
> because its mostly already like that. Apart from CDM for which I have no
> indication of another one getting added, rest of the blocks are already
> aligned towards a per-encoder model and not a "RM session" model.

But we should be leaning towards RM session.

> 
> Even if we end up doing it this way, most of the model is kind of unused
> really because each encoder will request its own topology anyway, there is
> just no aggregation for CDM which at this point is not needed as there is no
> HW we are aware of needing this.

With the resource allocation shifted to the CRTC individual encoders
do not request their own topology (as it is now a property of the full
output pipeline, not just an encoder). Yes, CDM aggregation into num_cdm
seems unnecessary as there is just one CDM block.

> I think the atomic_check validation is needed either way because if two
> encoders request cdm, we cannot do clone mode as there is only one cdm block
> today. Its just that we are not tracking num_cdm today due to reasons
> explained above but basically doing something like below seems right to me:
> 
> if (enc_is_in_clone_mode && needs_cdm)
> 	return -ENOTSUPPORTED;

This check is incorrect in my opinion. The hardware should be able to
support DP/YUV420 + WB/RGB and DP/RGB + WB/YUV combinations. Please
correct me if I'm wrong.

> When we add more cdm_blocks, we can drop this check and making needs_cdm a
> num_cdm will make it naturally fail.