Message ID | 20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com |
---|---|
Headers | show |
Series | drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+ | expand |
On 9/25/2024 1:12 AM, Jani Nikula wrote: > On Tue, 24 Sep 2024, Jessica Zhang <quic_jesszhan@quicinc.com> wrote: >> Add helper to check if the given CRTC state is in clone mode >> >> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> >> --- >> include/drm/drm_crtc.h | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h >> index 8b48a1974da3..ecb93e2c4afc 100644 >> --- a/include/drm/drm_crtc.h >> +++ b/include/drm/drm_crtc.h >> @@ -1323,5 +1323,12 @@ static inline struct drm_crtc *drm_crtc_find(struct drm_device *dev, >> >> int drm_crtc_create_scaling_filter_property(struct drm_crtc *crtc, >> unsigned int supported_filters); >> +static inline bool drm_crtc_in_clone_mode(struct drm_crtc_state *crtc_state) >> +{ >> + if (!crtc_state) >> + return false; >> + >> + return hweight32(crtc_state->encoder_mask) > 1; >> +} > > What's the benefit of this being static inline? > > You're implicitly depending on hweight32() being available, basically > <linux/bitops.h> being included. Maybe it already is, but it's the > accumulation of small and innocent looking things like this that then > explode the header dependencies, and make them harder to reduce. Hi Jani, Good point, I'll move the implementation to drm_crtc.c. Thanks, Jessica Zhang > > BR, > Jani. > >> >> #endif /* __DRM_CRTC_H__ */ > > -- > Jani Nikula, Intel
On 9/25/2024 2:11 PM, Dmitry Baryshkov wrote: > On Wed, 25 Sept 2024 at 22:39, Jessica Zhang <quic_jesszhan@quicinc.com> wrote: >> >> >> >> On 9/24/2024 4:13 PM, Dmitry Baryshkov wrote: >>> On Tue, Sep 24, 2024 at 03:59:21PM GMT, Jessica Zhang wrote: >>>> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> >>>> >>>> All resource allocation is centered around the LMs. Then other blocks >>>> (except DSCs) are allocated basing on the LMs that was selected, and LM >>>> powers up the CRTC rather than the encoder. >>>> >>>> Moreover if at some point the driver supports encoder cloning, >>>> allocating resources from the encoder will be incorrect, as all clones >>>> will have different encoder IDs, while LMs are to be shared by these >>>> encoders. >>>> >>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> >>>> [quic_abhinavk@quicinc.com: Refactored resource allocation for CDM] >>>> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> >>>> [quic_jesszhan@quicinc.com: Changed to grabbing exising global state] >>>> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> >>>> --- >>>> drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 86 ++++++++++++ >>>> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 201 +++++++++++----------------- >>>> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 19 +++ >>>> 3 files changed, 183 insertions(+), 123 deletions(-) >>>> >>>> @@ -544,159 +542,117 @@ void dpu_encoder_helper_split_config( >>>> } >>>> } >>>> >>>> -bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc) >>>> +void dpu_encoder_update_topology(struct drm_encoder *drm_enc, >>>> + struct msm_display_topology *topology, >>>> + struct drm_atomic_state *state, >>>> + const struct drm_display_mode *adj_mode) >>>> { >>>> struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc); >>>> - int i, intf_count = 0, num_dsc = 0; >>>> + struct drm_connector *connector; >>>> + struct drm_connector_state *conn_state; >>>> + struct msm_display_info *disp_info; >>>> + struct drm_framebuffer *fb; >>>> + struct msm_drm_private *priv; >>>> + int i; >>>> >>>> for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++) >>>> if (dpu_enc->phys_encs[i]) >>>> - intf_count++; >>>> + topology->num_intf++; >>>> >>>> - /* See dpu_encoder_get_topology, we only support 2:2:1 topology */ >>>> + /* We only support 2 DSC mode (with 2 LM and 1 INTF) */ >>>> if (dpu_enc->dsc) >>>> - num_dsc = 2; >>>> + topology->num_dsc += 2; >>>> >>>> - return (num_dsc > 0) && (num_dsc > intf_count); >>>> -} >>>> + connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc); >>>> + if (!connector) >>>> + return; >>>> + conn_state = drm_atomic_get_new_connector_state(state, connector); >>>> + if (!conn_state) >>>> + return; >>>> >>>> -struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc) >>>> -{ >>>> - struct msm_drm_private *priv = drm_enc->dev->dev_private; >>>> - struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc); >>>> - int index = dpu_enc->disp_info.h_tile_instance[0]; >>>> + disp_info = &dpu_enc->disp_info; >>>> >>>> - if (dpu_enc->disp_info.intf_type == INTF_DSI) >>>> - return msm_dsi_get_dsc_config(priv->dsi[index]); >>>> + priv = drm_enc->dev->dev_private; >>>> >>>> - return NULL; >>>> + /* >>>> + * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it. >>>> + * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check() >>>> + * earlier. >>>> + */ >>>> + if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) { >>>> + fb = conn_state->writeback_job->fb; >>>> + >>>> + if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb))) >>>> + topology->needs_cdm = true; >>>> + } else if (disp_info->intf_type == INTF_DP) { >>>> + if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode)) >>>> + topology->needs_cdm = true; >>>> + } >>> >>> Just to note, the needs_cdm is not enough once you introduce CWB >>> support. E.g. DP/YUV420 + WB/YUV case requires two CDM blocks (which we >>> don't have), but this doesn't get reflected in the topology. >> >> Hi Dmitry, >> >> Ack. I can add something to make atomic_check fail if the input FB is >> YUV format and CWB is enabled. > > I'd prefer for this to be more natural rather than just checking for > the DP && DP_YUV420 && WB && WB_FMT_YUV. In the worst case, count CDM > requests and then in RM check them against the catalog. But I had a > more logical (although more intrusive) implementation on my mind: > > struct msm_display_topology { > struct { > u32 num_intf; > u32 num_wb; > u32 num_dsc; > bool needs_cdm; > } outputs[MAX_OUTPUTS]; > u32 num_lm; > }; > > WDYT? > the struct msm_display_topology was originally designed as a per-encoder struct (dpu_encoder_get_topology() indicates the same). Making this an array of outputs was not needed as there is expected to be one struct msm_display_topology for each virt encoder's requested topology and the blocks inside of it other than LM, are "encoder" hw blocks. needs_cdm was made a boolean instead of a num_cdm_count like other hardware blocks because till the most recent chipset, we have only one CDM block. Whenever we do have more CDM blocks why will introducing num_cdm to the topology struct not solve your problem rather than making it an array of outputs? Because then, RM will know that the request exceeds the max blocks. I think what you are trying to do now is make struct msm_display_topology's encoder parts per-encoder and rest like num_lm per "RM session". The thought is not wrong but at the same time seems a bit of an overkill because its mostly already like that. Apart from CDM for which I have no indication of another one getting added, rest of the blocks are already aligned towards a per-encoder model and not a "RM session" model. Even if we end up doing it this way, most of the model is kind of unused really because each encoder will request its own topology anyway, there is just no aggregation for CDM which at this point is not needed as there is no HW we are aware of needing this. I think the atomic_check validation is needed either way because if two encoders request cdm, we cannot do clone mode as there is only one cdm block today. Its just that we are not tracking num_cdm today due to reasons explained above but basically doing something like below seems right to me: if (enc_is_in_clone_mode && needs_cdm) return -ENOTSUPPORTED; When we add more cdm_blocks, we can drop this check and making needs_cdm a num_cdm will make it naturally fail. >> >> Thanks, >> >> Jessica Zhang >> >>> >>>> } >>>> >>> -- >>> With best wishes >>> Dmitry >> > >
On Wed, Sep 25, 2024 at 02:49:48PM GMT, Abhinav Kumar wrote: > On 9/25/2024 2:11 PM, Dmitry Baryshkov wrote: > > On Wed, 25 Sept 2024 at 22:39, Jessica Zhang <quic_jesszhan@quicinc.com> wrote: > > > On 9/24/2024 4:13 PM, Dmitry Baryshkov wrote: > > > > On Tue, Sep 24, 2024 at 03:59:21PM GMT, Jessica Zhang wrote: > > > > > From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> > > > > > > > > > > All resource allocation is centered around the LMs. Then other blocks > > > > > (except DSCs) are allocated basing on the LMs that was selected, and LM > > > > > powers up the CRTC rather than the encoder. > > > > > > > > > > Moreover if at some point the driver supports encoder cloning, > > > > > allocating resources from the encoder will be incorrect, as all clones > > > > > will have different encoder IDs, while LMs are to be shared by these > > > > > encoders. > > > > > > > > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> > > > > > [quic_abhinavk@quicinc.com: Refactored resource allocation for CDM] > > > > > Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> > > > > > [quic_jesszhan@quicinc.com: Changed to grabbing exising global state] > > > > > Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> > > > > > --- > > > > > drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 86 ++++++++++++ > > > > > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 201 +++++++++++----------------- > > > > > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 19 +++ > > > > > 3 files changed, 183 insertions(+), 123 deletions(-) > > > > > > > > > > @@ -544,159 +542,117 @@ void dpu_encoder_helper_split_config( > > > > > } > > > > > } > > > > > > > > > > -bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc) > > > > > +void dpu_encoder_update_topology(struct drm_encoder *drm_enc, > > > > > + struct msm_display_topology *topology, > > > > > + struct drm_atomic_state *state, > > > > > + const struct drm_display_mode *adj_mode) > > > > > { > > > > > struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc); > > > > > - int i, intf_count = 0, num_dsc = 0; > > > > > + struct drm_connector *connector; > > > > > + struct drm_connector_state *conn_state; > > > > > + struct msm_display_info *disp_info; > > > > > + struct drm_framebuffer *fb; > > > > > + struct msm_drm_private *priv; > > > > > + int i; > > > > > > > > > > for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++) > > > > > if (dpu_enc->phys_encs[i]) > > > > > - intf_count++; > > > > > + topology->num_intf++; > > > > > > > > > > - /* See dpu_encoder_get_topology, we only support 2:2:1 topology */ > > > > > + /* We only support 2 DSC mode (with 2 LM and 1 INTF) */ > > > > > if (dpu_enc->dsc) > > > > > - num_dsc = 2; > > > > > + topology->num_dsc += 2; > > > > > > > > > > - return (num_dsc > 0) && (num_dsc > intf_count); > > > > > -} > > > > > + connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc); > > > > > + if (!connector) > > > > > + return; > > > > > + conn_state = drm_atomic_get_new_connector_state(state, connector); > > > > > + if (!conn_state) > > > > > + return; > > > > > > > > > > -struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc) > > > > > -{ > > > > > - struct msm_drm_private *priv = drm_enc->dev->dev_private; > > > > > - struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc); > > > > > - int index = dpu_enc->disp_info.h_tile_instance[0]; > > > > > + disp_info = &dpu_enc->disp_info; > > > > > > > > > > - if (dpu_enc->disp_info.intf_type == INTF_DSI) > > > > > - return msm_dsi_get_dsc_config(priv->dsi[index]); > > > > > + priv = drm_enc->dev->dev_private; > > > > > > > > > > - return NULL; > > > > > + /* > > > > > + * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it. > > > > > + * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check() > > > > > + * earlier. > > > > > + */ > > > > > + if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) { > > > > > + fb = conn_state->writeback_job->fb; > > > > > + > > > > > + if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb))) > > > > > + topology->needs_cdm = true; > > > > > + } else if (disp_info->intf_type == INTF_DP) { > > > > > + if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode)) > > > > > + topology->needs_cdm = true; > > > > > + } > > > > > > > > Just to note, the needs_cdm is not enough once you introduce CWB > > > > support. E.g. DP/YUV420 + WB/YUV case requires two CDM blocks (which we > > > > don't have), but this doesn't get reflected in the topology. > > > > > > Hi Dmitry, > > > > > > Ack. I can add something to make atomic_check fail if the input FB is > > > YUV format and CWB is enabled. > > > > I'd prefer for this to be more natural rather than just checking for > > the DP && DP_YUV420 && WB && WB_FMT_YUV. In the worst case, count CDM > > requests and then in RM check them against the catalog. But I had a > > more logical (although more intrusive) implementation on my mind: > > > > struct msm_display_topology { > > struct { > > u32 num_intf; > > u32 num_wb; > > u32 num_dsc; > > bool needs_cdm; > > } outputs[MAX_OUTPUTS]; > > u32 num_lm; > > }; > > > > WDYT? > > > > the struct msm_display_topology was originally designed as a per-encoder > struct (dpu_encoder_get_topology() indicates the same). Making this an array > of outputs was not needed as there is expected to be one struct > msm_display_topology for each virt encoder's requested topology and the > blocks inside of it other than LM, are "encoder" hw blocks. > > needs_cdm was made a boolean instead of a num_cdm_count like other hardware > blocks because till the most recent chipset, we have only one CDM block. > Whenever we do have more CDM blocks why will introducing num_cdm to the > topology struct not solve your problem rather than making it an array of > outputs? > > Because then, RM will know that the request exceeds the max blocks. > > I think what you are trying to do now is make struct msm_display_topology's > encoder parts per-encoder and rest like num_lm per "RM session". > > The thought is not wrong but at the same time seems a bit of an overkill > because its mostly already like that. Apart from CDM for which I have no > indication of another one getting added, rest of the blocks are already > aligned towards a per-encoder model and not a "RM session" model. But we should be leaning towards RM session. > > Even if we end up doing it this way, most of the model is kind of unused > really because each encoder will request its own topology anyway, there is > just no aggregation for CDM which at this point is not needed as there is no > HW we are aware of needing this. With the resource allocation shifted to the CRTC individual encoders do not request their own topology (as it is now a property of the full output pipeline, not just an encoder). Yes, CDM aggregation into num_cdm seems unnecessary as there is just one CDM block. > I think the atomic_check validation is needed either way because if two > encoders request cdm, we cannot do clone mode as there is only one cdm block > today. Its just that we are not tracking num_cdm today due to reasons > explained above but basically doing something like below seems right to me: > > if (enc_is_in_clone_mode && needs_cdm) > return -ENOTSUPPORTED; This check is incorrect in my opinion. The hardware should be able to support DP/YUV420 + WB/RGB and DP/RGB + WB/YUV combinations. Please correct me if I'm wrong. > When we add more cdm_blocks, we can drop this check and making needs_cdm a > num_cdm will make it naturally fail.