[v3,0/3] iommu/arm-smmu, drm/msm: Fixes for stall-on-fault

Message ID	20250122-msm-gpu-fault-fixes-next-v3-0-0afa00158521@gmail.com
Headers	show Received: from mail-qk1-f178.google.com (mail-qk1-f178.google.com [209.85.222.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8A5BA1607AC for <linux-arm-msm@vger.kernel.org>; Wed, 22 Jan 2025 20:01:09 +0000 (UTC) From: Connor Abbott <cwabbott0@gmail.com> Subject: [PATCH v3 0/3] iommu/arm-smmu, drm/msm: Fixes for stall-on-fault Date: Wed, 22 Jan 2025 15:00:57 -0500 Message-Id: <20250122-msm-gpu-fault-fixes-next-v3-0-0afa00158521@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit To: Rob Clark <robdclark@gmail.com>, Will Deacon <will@kernel.org>, Robin Murphy <robin.murphy@arm.com>, Joerg Roedel <joro@8bytes.org>, Sean Paul <sean@poorly.run>, Konrad Dybcio <konradybcio@kernel.org>, Abhinav Kumar <quic_abhinavk@quicinc.com>, Dmitry Baryshkov <dmitry.baryshkov@linaro.org>, Marijn Suijten <marijn.suijten@somainline.org> Cc: iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, freedreno@lists.freedesktop.org, Connor Abbott <cwabbott0@gmail.com>
Series	iommu/arm-smmu, drm/msm: Fixes for stall-on-fault \| expand [v3,0/3] iommu/arm-smmu, drm/msm: Fixes for stall-on-fault [v3,2/3] iommu/arm-smmu-qcom: Make set_stall work when the device is on

Connor Abbott Jan. 22, 2025, 8 p.m. UTC

drm/msm uses the stall-on-fault model to record the GPU state on the
first GPU page fault to help debugging. On systems where the GPU is
paired with a MMU-500, there were two problems:

1. The MMU-500 doesn't de-assert its interrupt line until the fault is
   resumed, which led to a storm of interrupts until the fault handler
   was called. If we got unlucky and the fault handler was on the same
   CPU as the interrupt, there was a deadlock.
2. The GPU is capable of generating page faults much faster than we can
   resume them. GMU (GPU Management Unit) shares the same context bank
   as the GPU, so if there was a sudden spurt of page faults it would be
   effectively starved and would trigger a watchdog reset, made even
   worse because the GPU cannot be reset while there's a pending
   transaction leaving the GPU permanently wedged.

Patch 1 fixes the first problem and is independent of the rest of the
series. Patch 3 fixes the second problem and is dependent on patch 2, so
there will have to be some cross-tree coordination.

I've rebased this series on the latest linux-next to avoid rebase
troubles.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
Changes in v3:
- Acknowledge the fault before resuming the transaction in patch 1.
- Add suggested extra context to commit messages.
- Link to v2: https://lore.kernel.org/r/20250120-msm-gpu-fault-fixes-next-v2-0-d636c4027042@gmail.com

Changes in v2:
- Remove unnecessary _irqsave when locking in IRQ handler (Robin)
- Reuse existing spinlock for CFIE manipulation (Robin)
- Lock CFCFG manipulation against concurrent CFIE manipulation
- Don't use timer to re-enable stall-on-fault. (Rob)
- Use more descriptive name for the function that re-enables
  stall-on-fault if the cooldown period has ended. (Rob)
- Link to v1: https://lore.kernel.org/r/20250117-msm-gpu-fault-fixes-next-v1-0-bc9b332b5d0b@gmail.com

---
Connor Abbott (3):
      iommu/arm-smmu: Fix spurious interrupts with stall-on-fault
      iommu/arm-smmu-qcom: Make set_stall work when the device is on
      drm/msm: Temporarily disable stall-on-fault after a page fault

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c      |  2 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c      |  4 +++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 42 +++++++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h    | 24 ++++++++++++++++
 drivers/gpu/drm/msm/msm_iommu.c            |  9 ++++++
 drivers/gpu/drm/msm/msm_mmu.h              |  1 +
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 45 +++++++++++++++++++++++++++---
 drivers/iommu/arm/arm-smmu/arm-smmu.c      | 41 ++++++++++++++++++++++++++-
 drivers/iommu/arm/arm-smmu/arm-smmu.h      |  1 -
 9 files changed, 162 insertions(+), 7 deletions(-)
---
base-commit: 0907e7fb35756464aa34c35d6abb02998418164b
change-id: 20250117-msm-gpu-fault-fixes-next-96e3098023e1

Best regards,

Prakash Gupta Jan. 23, 2025, 11:10 a.m. UTC | #1

On Wed, Jan 22, 2025 at 03:00:58PM -0500, Connor Abbott wrote:
 
> @@ -125,12 +125,25 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina
>  	struct arm_smmu_domain *smmu_domain = (void *)cookie;
>  	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
> -	u32 reg = 0;
> +	u32 reg = 0, sctlr;
> +	unsigned long flags;
>  
>  	if (terminate)
>  		reg |= ARM_SMMU_RESUME_TERMINATE;
>  
> +	spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> +
>  	arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg);
> +
At this point further transaction can be processed but SCTLR.CFIE is
cleared so subequent context fault will not generate interrupt till
SCTLR.CFIE is set.

> +	/*
> +	 * Re-enable interrupts after they were disabled by
> +	 * arm_smmu_context_fault().
> +	 */
> +	sctlr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR);
> +	sctlr |= ARM_SMMU_SCTLR_CFIE;
> +	arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR, sctlr);
> +
> +	spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
>  }
>  
>  static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set)
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 79afc92e1d8b984dd35c469a3f283ad0c78f3d26..ca1ff59015a63912f0f9c5256452b2b2efa928f1 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -463,13 +463,52 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>  	if (!(cfi.fsr & ARM_SMMU_CB_FSR_FAULT))
>  		return IRQ_NONE;
>  
> +	/*
> +	 * On some implementations FSR.SS asserts a context fault
> +	 * interrupt. We do not want this behavior, because resolving the
> +	 * original context fault typically requires operations that cannot be
> +	 * performed in IRQ context but leaving the stall unacknowledged will
> +	 * immediately lead to another spurious interrupt as FSR.SS is still
> +	 * set. Work around this by disabling interrupts for this context bank.
> +	 * It's expected that interrupts are re-enabled after resuming the
> +	 * translation.
> +	 *
> +	 * We have to do this before report_iommu_fault() so that we don't
> +	 * leave interrupts disabled in case the downstream user decides the
> +	 * fault can be resolved inside its fault handler.
> +	 *
> +	 * There is a possible race if there are multiple context banks sharing
> +	 * the same interrupt and both signal an interrupt in between writing
> +	 * RESUME and SCTLR. We could disable interrupts here before we
> +	 * re-enable them in the resume handler, leaving interrupts enabled.
> +	 * Lock the write to serialize it with the resume handler.
> +	 */
> +	if (cfi.fsr & ARM_SMMU_CB_FSR_SS) {
> +		u32 val;
> +
> +		spin_lock(&smmu_domain->cb_lock);
> +		val = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_SCTLR);
> +		val &= ~ARM_SMMU_SCTLR_CFIE;
> +		arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, val);
> +		spin_unlock(&smmu_domain->cb_lock);
> +	}
> +
> +	/*
> +	 * The SMMUv2 architecture specification says that if stall-on-fault is
> +	 * enabled the correct sequence is to write to SMMU_CBn_FSR to clear
> +	 * the fault and then write to SMMU_CBn_RESUME. Clear the interrupt
> +	 * first before running the user's fault handler to make sure we follow
> +	 * this sequence. It should be ok if there is another fault in the
> +	 * meantime because we have already read the fault info.
> +	 */
The context would remain stalled till we write to CBn_RESUME. Which is done
in qcom_adreno_smmu_resume_translation(). For a stalled context further
transactions are not processed and we shouldn't see further faults and
or fault inerrupts. Do you observe faults with stalled context?

> +	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
> +
>  	ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova,
>  		cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ);
>  
>  	if (ret == -ENOSYS && __ratelimit(&rs))
>  		arm_smmu_print_context_fault_info(smmu, idx, &cfi);
>  
> -	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
>  	return IRQ_HANDLED;
>  }
>  
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 2dbf3243b5ad2db01e17fb26c26c838942a491be..789c64ff3eb9944c8af37426e005241a8288da20 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -216,7 +216,6 @@ enum arm_smmu_cbar_type {
>  					 ARM_SMMU_CB_FSR_TLBLKF)
>  
>  #define ARM_SMMU_CB_FSR_FAULT		(ARM_SMMU_CB_FSR_MULTI |	\
> -					 ARM_SMMU_CB_FSR_SS |		\
Given writing to FSR.SS doesn't clear this bit but write to CBn_RESUME
does, this seems right. This but can be taken as separate patch.

>  					 ARM_SMMU_CB_FSR_UUT |		\
>  					 ARM_SMMU_CB_FSR_EF |		\
>  					 ARM_SMMU_CB_FSR_PF |		\
> 
> -- 
> 2.47.1
>

Robin Murphy Jan. 23, 2025, 11:51 a.m. UTC | #2

On 2025-01-23 11:10 am, Prakash Gupta wrote:
> On Wed, Jan 22, 2025 at 03:00:58PM -0500, Connor Abbott wrote:
>   
>> @@ -125,12 +125,25 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina
>>   	struct arm_smmu_domain *smmu_domain = (void *)cookie;
>>   	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>>   	struct arm_smmu_device *smmu = smmu_domain->smmu;
>> -	u32 reg = 0;
>> +	u32 reg = 0, sctlr;
>> +	unsigned long flags;
>>   
>>   	if (terminate)
>>   		reg |= ARM_SMMU_RESUME_TERMINATE;
>>   
>> +	spin_lock_irqsave(&smmu_domain->cb_lock, flags);
>> +
>>   	arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg);
>> +
> At this point further transaction can be processed but SCTLR.CFIE is
> cleared so subequent context fault will not generate interrupt till
> SCTLR.CFIE is set.
> 
>> +	/*
>> +	 * Re-enable interrupts after they were disabled by
>> +	 * arm_smmu_context_fault().
>> +	 */
>> +	sctlr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR);
>> +	sctlr |= ARM_SMMU_SCTLR_CFIE;
>> +	arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR, sctlr);
>> +
>> +	spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
>>   }
>>   
>>   static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set)
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> index 79afc92e1d8b984dd35c469a3f283ad0c78f3d26..ca1ff59015a63912f0f9c5256452b2b2efa928f1 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> @@ -463,13 +463,52 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>   	if (!(cfi.fsr & ARM_SMMU_CB_FSR_FAULT))
>>   		return IRQ_NONE;
>>   
>> +	/*
>> +	 * On some implementations FSR.SS asserts a context fault
>> +	 * interrupt. We do not want this behavior, because resolving the
>> +	 * original context fault typically requires operations that cannot be
>> +	 * performed in IRQ context but leaving the stall unacknowledged will
>> +	 * immediately lead to another spurious interrupt as FSR.SS is still
>> +	 * set. Work around this by disabling interrupts for this context bank.
>> +	 * It's expected that interrupts are re-enabled after resuming the
>> +	 * translation.
>> +	 *
>> +	 * We have to do this before report_iommu_fault() so that we don't
>> +	 * leave interrupts disabled in case the downstream user decides the
>> +	 * fault can be resolved inside its fault handler.
>> +	 *
>> +	 * There is a possible race if there are multiple context banks sharing
>> +	 * the same interrupt and both signal an interrupt in between writing
>> +	 * RESUME and SCTLR. We could disable interrupts here before we
>> +	 * re-enable them in the resume handler, leaving interrupts enabled.
>> +	 * Lock the write to serialize it with the resume handler.
>> +	 */
>> +	if (cfi.fsr & ARM_SMMU_CB_FSR_SS) {
>> +		u32 val;
>> +
>> +		spin_lock(&smmu_domain->cb_lock);
>> +		val = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_SCTLR);
>> +		val &= ~ARM_SMMU_SCTLR_CFIE;
>> +		arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, val);
>> +		spin_unlock(&smmu_domain->cb_lock);
>> +	}
>> +
>> +	/*
>> +	 * The SMMUv2 architecture specification says that if stall-on-fault is
>> +	 * enabled the correct sequence is to write to SMMU_CBn_FSR to clear
>> +	 * the fault and then write to SMMU_CBn_RESUME. Clear the interrupt
>> +	 * first before running the user's fault handler to make sure we follow
>> +	 * this sequence. It should be ok if there is another fault in the
>> +	 * meantime because we have already read the fault info.
>> +	 */
> The context would remain stalled till we write to CBn_RESUME. Which is done
> in qcom_adreno_smmu_resume_translation(). For a stalled context further
> transactions are not processed and we shouldn't see further faults and
> or fault inerrupts. Do you observe faults with stalled context?

This aspect isn't exclusive to stalled contexts though - even for 
"normal" terminated faults, clearing the FSR as soon as we've sampled 
all the associated fault registers is no bad thing, since if a second 
fault does occur while we're still reporting the first, we're then more 
likely to get a full syndrome rather than just the FSR.MULTI bit.

>> +	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
>> +
>>   	ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova,
>>   		cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ);
>>   
>>   	if (ret == -ENOSYS && __ratelimit(&rs))
>>   		arm_smmu_print_context_fault_info(smmu, idx, &cfi);
>>   
>> -	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
>>   	return IRQ_HANDLED;
>>   }
>>   
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> index 2dbf3243b5ad2db01e17fb26c26c838942a491be..789c64ff3eb9944c8af37426e005241a8288da20 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> @@ -216,7 +216,6 @@ enum arm_smmu_cbar_type {
>>   					 ARM_SMMU_CB_FSR_TLBLKF)
>>   
>>   #define ARM_SMMU_CB_FSR_FAULT		(ARM_SMMU_CB_FSR_MULTI |	\
>> -					 ARM_SMMU_CB_FSR_SS |		\
> Given writing to FSR.SS doesn't clear this bit but write to CBn_RESUME
> does, this seems right. This but can be taken as separate patch.

This change on its own isn't really useful - all that would achieve is 
that instead of constantly re-reporting the FSR.SS "fault", the 
interrupt goes unhandled and the IRQ core ends up disabling it 
permanently. If anything that's arguably worse, since the storm of 
context fault reports does at least give a fairly clear indication of 
what's gone wrong, rather than having to deduce the cause of an "irq n: 
nobody cared" message entirely by code inspection.

Thanks,
Robin.

> 
>>   					 ARM_SMMU_CB_FSR_UUT |		\
>>   					 ARM_SMMU_CB_FSR_EF |		\
>>   					 ARM_SMMU_CB_FSR_PF |		\
>>
>> -- 
>> 2.47.1
>>

Connor Abbott Jan. 23, 2025, 2 p.m. UTC | #3

On Thu, Jan 23, 2025 at 6:10 AM Prakash Gupta <quic_guptap@quicinc.com> wrote:
>
> On Wed, Jan 22, 2025 at 03:00:58PM -0500, Connor Abbott wrote:
>
> > @@ -125,12 +125,25 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina
> >       struct arm_smmu_domain *smmu_domain = (void *)cookie;
> >       struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> >       struct arm_smmu_device *smmu = smmu_domain->smmu;
> > -     u32 reg = 0;
> > +     u32 reg = 0, sctlr;
> > +     unsigned long flags;
> >
> >       if (terminate)
> >               reg |= ARM_SMMU_RESUME_TERMINATE;
> >
> > +     spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> > +
> >       arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg);
> > +
> At this point further transaction can be processed but SCTLR.CFIE is
> cleared so subequent context fault will not generate interrupt till
> SCTLR.CFIE is set.

If you're asking why the spin lock is there, it's because this isn't
true if there's another context bank, they share an interrupt line,
and it happens to fault around the same time. I haven't checked if
that's actually the case for Adreno, but in case this gets used by
other drivers and moved into common code I want it to be as robust as
possible. This is explained in the comment added to
arm_smmu_context_fault(). Also the next commit toggles CFCFG and we
want to serialize against that.

>
> > +     /*
> > +      * Re-enable interrupts after they were disabled by
> > +      * arm_smmu_context_fault().
> > +      */
> > +     sctlr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR);
> > +     sctlr |= ARM_SMMU_SCTLR_CFIE;
> > +     arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR, sctlr);
> > +
> > +     spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> >  }
> >
> >  static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set)
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > index 79afc92e1d8b984dd35c469a3f283ad0c78f3d26..ca1ff59015a63912f0f9c5256452b2b2efa928f1 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -463,13 +463,52 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
> >       if (!(cfi.fsr & ARM_SMMU_CB_FSR_FAULT))
> >               return IRQ_NONE;
> >
> > +     /*
> > +      * On some implementations FSR.SS asserts a context fault
> > +      * interrupt. We do not want this behavior, because resolving the
> > +      * original context fault typically requires operations that cannot be
> > +      * performed in IRQ context but leaving the stall unacknowledged will
> > +      * immediately lead to another spurious interrupt as FSR.SS is still
> > +      * set. Work around this by disabling interrupts for this context bank.
> > +      * It's expected that interrupts are re-enabled after resuming the
> > +      * translation.
> > +      *
> > +      * We have to do this before report_iommu_fault() so that we don't
> > +      * leave interrupts disabled in case the downstream user decides the
> > +      * fault can be resolved inside its fault handler.
> > +      *
> > +      * There is a possible race if there are multiple context banks sharing
> > +      * the same interrupt and both signal an interrupt in between writing
> > +      * RESUME and SCTLR. We could disable interrupts here before we
> > +      * re-enable them in the resume handler, leaving interrupts enabled.
> > +      * Lock the write to serialize it with the resume handler.
> > +      */
> > +     if (cfi.fsr & ARM_SMMU_CB_FSR_SS) {
> > +             u32 val;
> > +
> > +             spin_lock(&smmu_domain->cb_lock);
> > +             val = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_SCTLR);
> > +             val &= ~ARM_SMMU_SCTLR_CFIE;
> > +             arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, val);
> > +             spin_unlock(&smmu_domain->cb_lock);
> > +     }
> > +
> > +     /*
> > +      * The SMMUv2 architecture specification says that if stall-on-fault is
> > +      * enabled the correct sequence is to write to SMMU_CBn_FSR to clear
> > +      * the fault and then write to SMMU_CBn_RESUME. Clear the interrupt
> > +      * first before running the user's fault handler to make sure we follow
> > +      * this sequence. It should be ok if there is another fault in the
> > +      * meantime because we have already read the fault info.
> > +      */
> The context would remain stalled till we write to CBn_RESUME. Which is done
> in qcom_adreno_smmu_resume_translation(). For a stalled context further
> transactions are not processed and we shouldn't see further faults and
> or fault inerrupts. Do you observe faults with stalled context?

Yes. I've observed that on MMU-500 writing RESUME before the interrupt
has been cleared doesn't clear SS. This happened with v2 in the case
where there was already a devcoredump and drm/msm called
qcom_adreno_smmu_resume_translation() immediately from its fault
handler, and we'd get a storm of unhandled interrupts until it was
disabled. Given that the architecture spec says we're supposed to
clear the interrupt first this may have been an attempt to "help"
developers.

>
> > +     arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
> > +
> >       ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova,
> >               cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ);
> >
> >       if (ret == -ENOSYS && __ratelimit(&rs))
> >               arm_smmu_print_context_fault_info(smmu, idx, &cfi);
> >
> > -     arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
> >       return IRQ_HANDLED;
> >  }
> >
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > index 2dbf3243b5ad2db01e17fb26c26c838942a491be..789c64ff3eb9944c8af37426e005241a8288da20 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > @@ -216,7 +216,6 @@ enum arm_smmu_cbar_type {
> >                                        ARM_SMMU_CB_FSR_TLBLKF)
> >
> >  #define ARM_SMMU_CB_FSR_FAULT                (ARM_SMMU_CB_FSR_MULTI |        \
> > -                                      ARM_SMMU_CB_FSR_SS |           \
> Given writing to FSR.SS doesn't clear this bit but write to CBn_RESUME
> does, this seems right. This but can be taken as separate patch.
>
> >                                        ARM_SMMU_CB_FSR_UUT |          \
> >                                        ARM_SMMU_CB_FSR_EF |           \
> >                                        ARM_SMMU_CB_FSR_PF |           \
> >
> > --
> > 2.47.1
> >

Connor Abbott Jan. 23, 2025, 6:09 p.m. UTC | #4

On Thu, Jan 23, 2025 at 12:35 PM Prakash Gupta <quic_guptap@quicinc.com> wrote:
>
> On Thu, Jan 23, 2025 at 11:51:27AM +0000, Robin Murphy wrote:
> > On 2025-01-23 11:10 am, Prakash Gupta wrote:
> > > On Wed, Jan 22, 2025 at 03:00:58PM -0500, Connor Abbott wrote:
> > > > + /*
> > > > +  * The SMMUv2 architecture specification says that if stall-on-fault is
> > > > +  * enabled the correct sequence is to write to SMMU_CBn_FSR to clear
> > > > +  * the fault and then write to SMMU_CBn_RESUME. Clear the interrupt
> > > > +  * first before running the user's fault handler to make sure we follow
> > > > +  * this sequence. It should be ok if there is another fault in the
> > > > +  * meantime because we have already read the fault info.
> > > > +  */
> > > The context would remain stalled till we write to CBn_RESUME. Which is done
> > > in qcom_adreno_smmu_resume_translation(). For a stalled context further
> > > transactions are not processed and we shouldn't see further faults and
> > > or fault inerrupts. Do you observe faults with stalled context?
> >
> > This aspect isn't exclusive to stalled contexts though - even for "normal"
> > terminated faults, clearing the FSR as soon as we've sampled all the
> > associated fault registers is no bad thing, since if a second fault does
> > occur while we're still reporting the first, we're then more likely to get a
> > full syndrome rather than just the FSR.MULTI bit.
> >
> ARM SMMUv2 spec recommends, in case of reported fault sw should first
> correct the condition which casued the fault, I would interpret this as
> reporting fault to client using callback, and then write CBn_FSR and
> CBn_RESUME in this order. Even in case of reported fault where context is
> not stalled, the first step, IMO, I see no reason why should be any
> different.  I agree that delaying fault clearance can result in FSR.MULTI
> being set, but clearning fault before  prevent clients to use SCTLR.HUPCF
> on subsequent transactions while they take any debug action. The client
> should be reported fault in the same state it occured. Please refer
> qcom_smmu_context_fault() for this sequence.

It's not possible to implement it the way the spec describes because
correcting the condition which caused the fault cannot always be done
in the client's callback. Sometimes it has to be deferred to a handler
not in IRQ context. However we must clear FSR (except for the SS bit)
before leaving the fault handler because while the transaction is
pending we want to be able to distinguish between a subsequent fault
in another context bank (only FSR.SS will be set and we can ignore and
return IRQ_NONE) and this context bank if they share an interrupt
line. So, moving this up generally doesn't hurt and fixes the case
where the client does resume the transaction inside its handler. I
don't think there's really another way to implement this.

And I have no idea why you think this prevents clients from using
HUPCF, we already use HUPCF in drm/msm and it works fine with this
series.

>
> > > > + arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
> > > > +
> > > >           ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova,
> > > >                   cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ);
> > > >           if (ret == -ENOSYS && __ratelimit(&rs))
> > > >                   arm_smmu_print_context_fault_info(smmu, idx, &cfi);
> > > > - arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
> > > >           return IRQ_HANDLED;
> > > >   }
> > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > > index 2dbf3243b5ad2db01e17fb26c26c838942a491be..789c64ff3eb9944c8af37426e005241a8288da20 100644
> > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > > @@ -216,7 +216,6 @@ enum arm_smmu_cbar_type {
> > > >                                            ARM_SMMU_CB_FSR_TLBLKF)
> > > >   #define ARM_SMMU_CB_FSR_FAULT           (ARM_SMMU_CB_FSR_MULTI |        \
> > > > -                                  ARM_SMMU_CB_FSR_SS |           \
> > > Given writing to FSR.SS doesn't clear this bit but write to CBn_RESUME
> > > does, this seems right. This but can be taken as separate patch.
> >
> > This change on its own isn't really useful - all that would achieve is that
> > instead of constantly re-reporting the FSR.SS "fault", the interrupt goes
> > unhandled and the IRQ core ends up disabling it permanently. If anything
> > that's arguably worse, since the storm of context fault reports does at
> > least give a fairly clear indication of what's gone wrong, rather than
> > having to deduce the cause of an "irq n: nobody cared" message entirely by
> > code inspection.
> >
> Does spec allow or do we see reported fault with just FSR.SS bit.

Yes, the spec allows it and we do see it in practice. After this patch
we may still see it in the case where multiple context banks share an
interrupt and the other bank faults, but the correct action is to
ignore it because we've disabled CFIE for this bank already. This is
why this hunk has to be in this patch.

> If answer
> is no then Keeping FSR_SS would be misleading. Here ARM_SMMU_CB_FSR_FAULT
> is used to clear fault bits or check valid faults. Also validity of this
> is not based on rest of the change.
>
> Thanks,
> Prakash
>
> > >
> > > >                                            ARM_SMMU_CB_FSR_UUT |          \
> > > >                                            ARM_SMMU_CB_FSR_EF |           \
> > > >                                            ARM_SMMU_CB_FSR_PF |           \
> > > >
> > > > --
> > > > 2.47.1
> > > >
> >

Connor Abbott Jan. 23, 2025, 8:14 p.m. UTC | #5

On Thu, Jan 23, 2025 at 2:26 PM Prakash Gupta <quic_guptap@quicinc.com> wrote:
>
> On Thu, Jan 23, 2025 at 09:00:17AM -0500, Connor Abbott wrote:
> > On Thu, Jan 23, 2025 at 6:10 AM Prakash Gupta <quic_guptap@quicinc.com> wrote:
> > >
> > > On Wed, Jan 22, 2025 at 03:00:58PM -0500, Connor Abbott wrote:
> > >
> > > > +     /*
> > > > +      * On some implementations FSR.SS asserts a context fault
> > > > +      * interrupt. We do not want this behavior, because resolving the
> > > > +      * original context fault typically requires operations that cannot be
> > > > +      * performed in IRQ context but leaving the stall unacknowledged will
> > > > +      * immediately lead to another spurious interrupt as FSR.SS is still
> > > > +      * set. Work around this by disabling interrupts for this context bank.
> > > > +      * It's expected that interrupts are re-enabled after resuming the
> > > > +      * translation.
> > > > +      *
> > > > +      * We have to do this before report_iommu_fault() so that we don't
> > > > +      * leave interrupts disabled in case the downstream user decides the
> > > > +      * fault can be resolved inside its fault handler.
> > > > +      *
> > > > +      * There is a possible race if there are multiple context banks sharing
> > > > +      * the same interrupt and both signal an interrupt in between writing
> > > > +      * RESUME and SCTLR. We could disable interrupts here before we
> Not sure if multiple context bank with shared interrupt supported for
> arm-smmu driver, but even if does than context fault handler they would
> operate on their respective context register space, so I don't see the race
> at context register update.

Let's say CB1 enables stall-on-fault. The sequence is something like this:

- CB0 faults, context fault handler for CB0 runs first
- resume handler writes RESUME for CB1
- CB1 faults on some other pending transaction
- context fault handler for CB1 runs due to the fault from CB0 on
shared interrupt line, discovers there is an additional fault because
we just wrote RESUME
- context fault handler for CB1 writes SCTLR disabling CFIE
- resume handler writes SCTLR enabling CFIE

At the end CFIE is incorrectly enabled while the second CB1 fault is
pending and we get an interrupt storm.

Realistically this is only going to happen if the resume handler gets
interrupted in between the two register writes, otherwise it will
probably win the race and write SCTLR before CB1 can run its context
fault handler. But technically we need the spinlock.

>
> > > > +      * re-enable them in the resume handler, leaving interrupts enabled.
> > > > +      * Lock the write to serialize it with the resume handler.
> > > > +      */
> > > > +     if (cfi.fsr & ARM_SMMU_CB_FSR_SS) {
> > > > +             u32 val;
> > > > +
> > > > +             spin_lock(&smmu_domain->cb_lock);
> > > > +             val = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_SCTLR);
> > > > +             val &= ~ARM_SMMU_SCTLR_CFIE;
> > > > +             arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, val);
> > > > +             spin_unlock(&smmu_domain->cb_lock);
> > > > +     }
> > > > +
> > > > +     /*
> > > > +      * The SMMUv2 architecture specification says that if stall-on-fault is
> > > > +      * enabled the correct sequence is to write to SMMU_CBn_FSR to clear
> > > > +      * the fault and then write to SMMU_CBn_RESUME. Clear the interrupt
> > > > +      * first before running the user's fault handler to make sure we follow
> > > > +      * this sequence. It should be ok if there is another fault in the
> > > > +      * meantime because we have already read the fault info.
> > > > +      */
> qcom_adreno_smmu_get_fault_info() reads the fault info as part of client
> fault hanlder. So it would not be ok to clear FSR before reporting the
> fault to client.
>
> > > The context would remain stalled till we write to CBn_RESUME. Which is done
> > > in qcom_adreno_smmu_resume_translation(). For a stalled context further
> > > transactions are not processed and we shouldn't see further faults and
> > > or fault inerrupts. Do you observe faults with stalled context?
> >
> > Yes. I've observed that on MMU-500 writing RESUME before the interrupt
> > has been cleared doesn't clear SS. This happened with v2 in the case
> > where there was already a devcoredump and drm/msm called
> > qcom_adreno_smmu_resume_translation() immediately from its fault
> > handler, and we'd get a storm of unhandled interrupts until it was
> > disabled. Given that the architecture spec says we're supposed to
> > clear the interrupt first this may have been an attempt to "help"
> > developers.
> >
>
> Just to clarify, present sequence is:
> 1. context fault with stall enable. FSR.SS set.
> 2. Report fault to client
> 3. resume/terminate stalled transaction
> 4. clear FSR
>
> At what point when you try #2->#3->#4 or #4->#2->#3 sequence, is FSR.SS
> cleared and interrupt storm is observed.

With #2->#3->#4 FSR.SS is *not* cleared and there is a subsequent
interrupt storm with only FSR.SS asserted. With #4->#2->#3 there is no
interrupt storm. From this we conclude that MMU-500 does not clear
FSR.SS unless #4 happens before #3.

> The way CFIE disable is helping
> with current patch indicates write FSR is unstalling context and subsequent
> transactions are faulted.

No, it does not indicate that. The interrupt storm happens even when
there is exactly one context fault, and when the interrupt storm
happens *only* FSR.SS is asserted. I've verified this with debug
prints. Once more, MMU-500 will assert an interrupt when only FSR.SS
is asserted. This has nothing to do with subsequent transactions.

> Do you stop getting interrupt storm after write
> RESUME.

Yes, as long as the write to RESUME happens after all other bits in
FSR are cleared.

> If you can mention your SCTLR configuration and FSR state in test
> sequence, it would be clearer.

SCTLR has both HUPCF and CFCFG enabled.

>
> An aside, If reducing delay between FSR and RESUME write helps then both
> can be done as part of qcom_adreno_smmu_resume_translation(). This will
> follow spec recommendation and also make sure fault registers are not
> cleared before reporting fault to client.
>
> Thanks,
> Prakash

[v3,0/3] iommu/arm-smmu, drm/msm: Fixes for stall-on-fault

Message

Comments