mtd: rawnand: Do not check for bad block if bbt is unavailable

Message ID 20210130035412.6456-1-manivannan.sadhasivam@linaro.org
State New

Commit Message

Manivannan Sadhasivam Jan. 30, 2021, 3:54 a.m. UTC
The bbt pointer will be unavailable when the NAND_SKIP_BBTSCAN option is
set for a NAND chip. The intention is to skip scanning for bad
blocks at boot time. However, the MTD core calls the
_block_isreserved() and _block_isbad() callbacks unconditionally for
rawnand devices while collecting the ECC stats, because the callbacks
are always present.

The _block_isreserved() callback for rawnand bails out if the bbt
pointer is not available, but _block_isbad() continues without
checking for it. This contradicts the NAND_SKIP_BBTSCAN option,
since the bad block check happens anyway (i.e., there is not much
difference between scanning for bad blocks and checking each block for
bad ones).

Hence, do not check for bad blocks if the bbt pointer is unavailable.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
 drivers/mtd/nand/raw/nand_base.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Miquel Raynal Feb. 1, 2021, 2:18 p.m. UTC | #1
Hi Manivannan,

Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Sat,
30 Jan 2021 09:24:12 +0530:

> The bbt pointer will be unavailable when NAND_SKIP_BBTSCAN option is
> set for a NAND chip. The intention is to skip scanning for the bad
> blocks during boot time.

I don't have the same understanding: this flag skips the bad block
table scan, not the bad block scan. We do want to scan all the devices
in order to construct a RAM based table.

> However, the MTD core will call
> _block_isreserved() and _block_isbad() callbacks unconditionally for
> the rawnand devices due to the callbacks always present while collecting
> the ecc stats.
> 
> The _block_isreserved() callback for rawnand will bail out if bbt
> pointer is not available. But _block_isbad() will continue without
> checking for it. So this contradicts with the NAND_SKIP_BBTSCAN option
> since the bad block check will happen anyways (ie., not much difference
> between scanning for bad blocks and checking each block for bad ones).
> 
> Hence, do not check for the bad block if bbt pointer is unavailable.

Not checking for bad blocks at all feels insane. I don't really get the
scope and goal of such change?

> 
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> ---
>  drivers/mtd/nand/raw/nand_base.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> index c33fa1b1847f..f18cd1db79a9 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -4286,6 +4286,9 @@ static int nand_block_isbad(struct mtd_info *mtd, loff_t offs)
>  	int chipnr = (int)(offs >> chip->chip_shift);
>  	int ret;
>  
> +	if (!chip->bbt)
> +		return 0;
> +
>  	/* Select the NAND device */
>  	ret = nand_get_device(chip);
>  	if (ret)


Cheers,
Miquèl
Manivannan Sadhasivam Feb. 2, 2021, 4:16 a.m. UTC | #2
Hi,

On Mon, Feb 01, 2021 at 03:18:24PM +0100, Miquel Raynal wrote:
> Hi Manivannan,
> 
> Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Sat,
> 30 Jan 2021 09:24:12 +0530:
> 
> > The bbt pointer will be unavailable when NAND_SKIP_BBTSCAN option is
> > set for a NAND chip. The intention is to skip scanning for the bad
> > blocks during boot time.
> 
> I don't have the same understanding: this flag skips the bad block
> table scan, not the bad block scan. We do want to scan all the devices
> in order to construct a RAM based table.
> 
> > However, the MTD core will call
> > _block_isreserved() and _block_isbad() callbacks unconditionally for
> > the rawnand devices due to the callbacks always present while collecting
> > the ecc stats.
> > 
> > The _block_isreserved() callback for rawnand will bail out if bbt
> > pointer is not available. But _block_isbad() will continue without
> > checking for it. So this contradicts with the NAND_SKIP_BBTSCAN option
> > since the bad block check will happen anyways (ie., not much difference
> > between scanning for bad blocks and checking each block for bad ones).
> > 
> > Hence, do not check for the bad block if bbt pointer is unavailable.
> 
> Not checking for bad blocks at all feels insane. I don't really get the
> scope and goal of such change?
> 

The issue I encountered is that on the Telit FN980 device, one of the
partitions seems to be protected. Trying to read the bad blocks in
that partition makes the device reboot during boot.

There seems to be no flag passed by the parser for this partition. So
the only way I could let the device boot is to completely skip the
bad block check.

AFAIK, the MTD core only supports checking for the reserved blocks
used for BBM, and there is no way to check for a reserved partition like
this.

I agree that skipping the bad block check is not sane, but I don't know
any other way to handle this problem.

Thanks,
Mani

> > 
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > ---
> >  drivers/mtd/nand/raw/nand_base.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > index c33fa1b1847f..f18cd1db79a9 100644
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -4286,6 +4286,9 @@ static int nand_block_isbad(struct mtd_info *mtd, loff_t offs)
> >  	int chipnr = (int)(offs >> chip->chip_shift);
> >  	int ret;
> >  
> > +	if (!chip->bbt)
> > +		return 0;
> > +
> >  	/* Select the NAND device */
> >  	ret = nand_get_device(chip);
> >  	if (ret)
> 
> Cheers,
> Miquèl
Manivannan Sadhasivam Feb. 3, 2021, 9:58 a.m. UTC | #3
Hi Miquel, 

On 2 February 2021 1:44:59 PM IST, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>Hi Manivannan,
>
>Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Tue,
>2 Feb 2021 09:46:14 +0530:
>
>> Hi,
>> 
>> On Mon, Feb 01, 2021 at 03:18:24PM +0100, Miquel Raynal wrote:
>> > Hi Manivannan,
>> > 
>> > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on
>Sat,
>> > 30 Jan 2021 09:24:12 +0530:
>> >   
>> > > The bbt pointer will be unavailable when NAND_SKIP_BBTSCAN option
>is
>> > > set for a NAND chip. The intention is to skip scanning for the
>bad
>> > > blocks during boot time.  
>> > 
>> > I don't have the same understanding: this flag skips the bad block
>> > table scan, not the bad block scan. We do want to scan all the
>devices
>> > in order to construct a RAM based table.
>> >   
>> > > However, the MTD core will call
>> > > _block_isreserved() and _block_isbad() callbacks unconditionally
>for
>> > > the rawnand devices due to the callbacks always present while
>collecting
>> > > the ecc stats.
>> > > 
>> > > The _block_isreserved() callback for rawnand will bail out if bbt
>> > > pointer is not available. But _block_isbad() will continue
>without
>> > > checking for it. So this contradicts with the NAND_SKIP_BBTSCAN
>option
>> > > since the bad block check will happen anyways (ie., not much
>difference
>> > > between scanning for bad blocks and checking each block for bad
>ones).
>> > > 
>> > > Hence, do not check for the bad block if bbt pointer is
>unavailable.  
>> > 
>> > Not checking for bad blocks at all feels insane. I don't really get
>the
>> > scope and goal of such change?
>> >   
>> 
>> The issue I encountered is, on the Telit FN980 device one of the
>> partition seems to be protected. So trying to read the bad blocks in
>> that partition makes the device to reboot during boot.
>
>o_O
>
>Reading a protected block makes the device to reboot?
>
>What is the exact device? Can you share the datasheet? Is this behavior
>expected? Because it seems really broken to me, a read should not
>trigger *anything* that bad.
>

I got more information from the vendor, Telit. Access to the 3rd
partition is protected by TrustZone, and any access in non-privileged
mode (where the Linux kernel runs) causes a kernel panic and the device
reboots.

>> There seems to be no flag passed by the parser for this partition. So
>> the only way I could let the device to boot is to completely skip the
>> bad block check.
>
>We do have a "lock" property which informs the host to first unlock the
>device, would this help? Is this locking reversible?
>
>> AFAIK, MTD core only supports checking for the reserved blocks to be
>> used for BBM and there is no way to check for a reserved partition
>like
>> this.
>
>It sounds like a chip specificity/bug, would it make sense to add a
>specific vendor implementation for that?
>

So it looks like this is a vendor quirk, but this case might arise in
the future for other platforms as well.

Thanks, 
Mani

>> I agree that skipping bad block check is not a sane way but I don't
>know
>> any other way to handle this problem.
>> 
>> Thanks,
>> Mani
>> 
>
>Thanks,
>Miquèl
Miquel Raynal Feb. 3, 2021, 10:05 a.m. UTC | #4
Hi Manivannan,

Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
03 Feb 2021 15:28:20 +0530:

> Hi Miquel, 
> 
> On 2 February 2021 1:44:59 PM IST, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >Hi Manivannan,
> >
> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Tue,
> >2 Feb 2021 09:46:14 +0530:
> >  
> >> Hi,
> >> 
> >> On Mon, Feb 01, 2021 at 03:18:24PM +0100, Miquel Raynal wrote:  
> >> > Hi Manivannan,
> >> > 
> >> > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on  
> >Sat,  
> >> > 30 Jan 2021 09:24:12 +0530:
> >> >     
> >> > > The bbt pointer will be unavailable when NAND_SKIP_BBTSCAN option  
> >is  
> >> > > set for a NAND chip. The intention is to skip scanning for the  
> >bad  
> >> > > blocks during boot time.    
> >> > 
> >> > I don't have the same understanding: this flag skips the bad block
> >> > table scan, not the bad block scan. We do want to scan all the  
> >devices  
> >> > in order to construct a RAM based table.
> >> >     
> >> > > However, the MTD core will call
> >> > > _block_isreserved() and _block_isbad() callbacks unconditionally  
> >for  
> >> > > the rawnand devices due to the callbacks always present while  
> >collecting  
> >> > > the ecc stats.
> >> > > 
> >> > > The _block_isreserved() callback for rawnand will bail out if bbt
> >> > > pointer is not available. But _block_isbad() will continue  
> >without  
> >> > > checking for it. So this contradicts with the NAND_SKIP_BBTSCAN  
> >option  
> >> > > since the bad block check will happen anyways (ie., not much  
> >difference  
> >> > > between scanning for bad blocks and checking each block for bad  
> >ones).  
> >> > > 
> >> > > Hence, do not check for the bad block if bbt pointer is  
> >unavailable.    
> >> > 
> >> > Not checking for bad blocks at all feels insane. I don't really get  
> >the  
> >> > scope and goal of such change?
> >> >     
> >> 
> >> The issue I encountered is, on the Telit FN980 device one of the
> >> partition seems to be protected. So trying to read the bad blocks in
> >> that partition makes the device to reboot during boot.  
> >
> >o_O
> >
> >Reading a protected block makes the device to reboot?
> >
> >What is the exact device? Can you share the datasheet? Is this behavior
> >expected? Because it seems really broken to me, a read should not
> >trigger *anything* that bad.
> >  
> 
> I got more information from the vendor, Telit. The access to the 3rd partition is protected by Trustzone and any access in non privileged mode (where Linux kernel runs) causes kernel panic and the device reboots. 

Ok, so this is not a chip feature but more a host constraint.

In this case it would be a good idea to add a host DT property which
describes the zone to avoid accessing it. Something like:

	secure-area/secure-section = <start length>;
Manivannan Sadhasivam Feb. 3, 2021, 10:12 a.m. UTC | #5
Hi Miquel, 

On 3 February 2021 3:35:22 PM IST, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>Hi Manivannan,
>
>Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
>03 Feb 2021 15:28:20 +0530:
>
>> Hi Miquel, 
>> 
>> On 2 February 2021 1:44:59 PM IST, Miquel Raynal
><miquel.raynal@bootlin.com> wrote:
>> >Hi Manivannan,
>> >
>> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on
>Tue,
>> >2 Feb 2021 09:46:14 +0530:
>> >  
>> >> Hi,
>> >> 
>> >> On Mon, Feb 01, 2021 at 03:18:24PM +0100, Miquel Raynal wrote:  
>> >> > Hi Manivannan,
>> >> > 
>> >> > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote
>on  
>> >Sat,  
>> >> > 30 Jan 2021 09:24:12 +0530:
>> >> >     
>> >> > > The bbt pointer will be unavailable when NAND_SKIP_BBTSCAN
>option  
>> >is  
>> >> > > set for a NAND chip. The intention is to skip scanning for the
> 
>> >bad  
>> >> > > blocks during boot time.    
>> >> > 
>> >> > I don't have the same understanding: this flag skips the bad
>block
>> >> > table scan, not the bad block scan. We do want to scan all the  
>> >devices  
>> >> > in order to construct a RAM based table.
>> >> >     
>> >> > > However, the MTD core will call
>> >> > > _block_isreserved() and _block_isbad() callbacks
>unconditionally  
>> >for  
>> >> > > the rawnand devices due to the callbacks always present while 
>
>> >collecting  
>> >> > > the ecc stats.
>> >> > > 
>> >> > > The _block_isreserved() callback for rawnand will bail out if
>bbt
>> >> > > pointer is not available. But _block_isbad() will continue  
>> >without  
>> >> > > checking for it. So this contradicts with the
>NAND_SKIP_BBTSCAN  
>> >option  
>> >> > > since the bad block check will happen anyways (ie., not much  
>> >difference  
>> >> > > between scanning for bad blocks and checking each block for
>bad  
>> >ones).  
>> >> > > 
>> >> > > Hence, do not check for the bad block if bbt pointer is  
>> >unavailable.    
>> >> > 
>> >> > Not checking for bad blocks at all feels insane. I don't really
>get  
>> >the  
>> >> > scope and goal of such change?
>> >> >     
>> >> 
>> >> The issue I encountered is, on the Telit FN980 device one of the
>> >> partition seems to be protected. So trying to read the bad blocks
>in
>> >> that partition makes the device to reboot during boot.  
>> >
>> >o_O
>> >
>> >Reading a protected block makes the device to reboot?
>> >
>> >What is the exact device? Can you share the datasheet? Is this
>behavior
>> >expected? Because it seems really broken to me, a read should not
>> >trigger *anything* that bad.
>> >  
>> 
>> I got more information from the vendor, Telit. The access to the 3rd
>partition is protected by Trustzone and any access in non privileged
>mode (where Linux kernel runs) causes kernel panic and the device
>reboots. 
>
>Ok, so this is not a chip feature but more a host constraint.
>
>In this case it would be a good idea to add a host DT property which
>describes the zone to avoid accessing it. Something like:
>
>	secure-area/secure-section = <start length>;
>
>From the core perspective, we should parse this property early enough
>and return -EIO when trying to access this area.
>
>Does this solution sound reasonable to you?
>

This sounds good to me. I'll give it a go and share the patch soon. 

Thanks, 
Mani

>Thanks,
>Miquèl
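
The check proposed above (refuse any access that touches the protected area and return -EIO) can be sketched in plain C. This is a minimal user-space model, not kernel code: `struct secure_region`, `region_overlaps()` and `check_secure_access()` are hypothetical names that appear nowhere in the thread, and the sketch assumes that offsets and sizes are byte values which do not overflow 64 bits when added.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one protected range, in bytes. */
struct secure_region {
	uint64_t offset;
	uint64_t size;
};

/* True when the access [offs, offs + len) overlaps the region. */
static bool region_overlaps(const struct secure_region *r,
			    uint64_t offs, uint64_t len)
{
	return len && r->size &&
	       offs < r->offset + r->size &&
	       r->offset < offs + len;
}

/*
 * Returns 0 when the access is allowed, or -EIO when it touches any
 * protected range, mirroring the error code suggested in the thread.
 */
static int check_secure_access(const struct secure_region *regions,
			       unsigned int nr_regions,
			       uint64_t offs, uint64_t len)
{
	for (unsigned int i = 0; i < nr_regions; i++)
		if (region_overlaps(&regions[i], offs, len))
			return -EIO;

	return 0;
}
```

Whoever ends up owning the property would run such a check before issuing any read, write, erase or bad block query, so the protected area is never touched from non-secure mode.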
Boris Brezillon Feb. 3, 2021, 10:19 a.m. UTC | #6
On Wed, 03 Feb 2021 15:42:02 +0530
Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:

> >> 
> >> I got more information from the vendor, Telit. The access to the 3rd  
> >partition is protected by Trustzone and any access in non privileged
> >mode (where Linux kernel runs) causes kernel panic and the device
> >reboots. 

Out of curiosity, is it a per-CS-line thing or is this section
protected on all CS?

> >
> >Ok, so this is not a chip feature but more a host constraint.
> >
> >In this case it would be a good idea to add a host DT property which
> >describes the zone to avoid accessing it. Something like:
> >
> >	secure-area/secure-section = <start length>;
> >
> >From the core perspective, we should parse this property early enough
> >and return -EIO when trying to access this area.

FWIW, I'm not sure making it part of the core is a good idea, at least
not until we have a different platform with the same needs. The
controller driver can parse it and return -EACCES (or -EIO) when this
section is accessed.
Manivannan Sadhasivam Feb. 3, 2021, 10:52 a.m. UTC | #7
On 3 February 2021 3:49:14 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:
>On Wed, 03 Feb 2021 15:42:02 +0530
>Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
>
>> >> 
>> >> I got more information from the vendor, Telit. The access to the
>3rd  
>> >partition is protected by Trustzone and any access in non privileged
>> >mode (where Linux kernel runs) causes kernel panic and the device
>> >reboots. 
>
>Out of curiosity, is it a per-CS-line thing or is this section
>protected on all CS?
>

Sorry, I didn't get your question. 

>> >
>> >Ok, so this is not a chip feature but more a host constraint.
>> >
>> >In this case it would be a good idea to add a host DT property which
>> >describes the zone to avoid accessing it. Something like:
>> >
>> >	secure-area/secure-section = <start length>;
>> >
>> >From the core perspective, we should parse this property early
>enough
>> >and return -EIO when trying to access this area.
>
>FWIW, I'm not sure making it part of the core is a good idea, at least
>not until we have a different platform with a same needs. The
>controller driver can parse it and return -EACCESS (or -EIO) when this
>section is accessed.

Fine with me. 

Thanks, 
Mani
Boris Brezillon Feb. 3, 2021, 11:24 a.m. UTC | #8
On Wed, 03 Feb 2021 16:22:42 +0530
Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:

> On 3 February 2021 3:49:14 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:
> >On Wed, 03 Feb 2021 15:42:02 +0530
> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> >  
> >> >> 
> >> >> I got more information from the vendor, Telit. The access to the  
> >3rd    
> >> >partition is protected by Trustzone and any access in non privileged
> >> >mode (where Linux kernel runs) causes kernel panic and the device
> >> >reboots.   
> >
> >Out of curiosity, is it a per-CS-line thing or is this section
> >protected on all CS?
> >  
> 
> Sorry, I didn't get your question. 

The qcom controller can handle several chips, each connected through a
different CS (chip-select) line, right? I'm wondering if the firmware
running in secure mode has the ability to block access for a specific
CS line or if all CS lines have the same constraint. That will impact
the way you describe it in your DT (in one case the secure-region
property should be under the controller node, in the other case it
should be under the NAND chip node).
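
The distinction drawn above between a per-CS restriction and a controller-wide one can be modelled with a small C sketch. Everything here is an assumption made for illustration: `struct cs_secure_region`, the `SECURE_ANY_CS` wildcard and `cs_region_blocks()` are hypothetical and do not describe the qcom driver or any existing binding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wildcard: the firmware ignores the CS line entirely. */
#define SECURE_ANY_CS (-1)

struct cs_secure_region {
	int cs;			/* chip-select index, or SECURE_ANY_CS */
	uint64_t offset;	/* start of the protected range, in bytes */
	uint64_t size;		/* length of the protected range, in bytes */
};

/* True when an access at @offs on chip-select @cs must be refused. */
static bool cs_region_blocks(const struct cs_secure_region *r,
			     int cs, uint64_t offs)
{
	if (r->cs != SECURE_ANY_CS && r->cs != cs)
		return false;

	return offs >= r->offset && offs - r->offset < r->size;
}
```

A region bound to one CS line corresponds to the "under the NAND chip node" case, while a wildcard region corresponds to the "under the controller node" case.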
Manivannan Sadhasivam Feb. 3, 2021, 11:41 a.m. UTC | #9
On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:
>On Wed, 03 Feb 2021 16:22:42 +0530
>Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
>
>> On 3 February 2021 3:49:14 PM IST, Boris Brezillon
><boris.brezillon@collabora.com> wrote:
>> >On Wed, 03 Feb 2021 15:42:02 +0530
>> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
>> >  
>> >> >> 
>> >> >> I got more information from the vendor, Telit. The access to
>the  
>> >3rd    
>> >> >partition is protected by Trustzone and any access in non
>privileged
>> >> >mode (where Linux kernel runs) causes kernel panic and the device
>> >> >reboots.   
>> >
>> >Out of curiosity, is it a per-CS-line thing or is this section
>> >protected on all CS?
>> >  
>> 
>> Sorry, I didn't get your question. 
>
>The qcom controller can handle several chips, each connected through a
>different CS (chip-select) line, right? I'm wondering if the firmware
>running in secure mode has the ability to block access for a specific
>CS line or if all CS lines have the same constraint. That will impact
>the way you describe it in your DT (in one case the secure-region
>property should be under the controller node, in the other case it
>should be under the NAND chip node).

Right. I believe the implementation is common to all NAND chips so the property should be in the controller node. 

Thanks, 
Mani
Miquel Raynal Feb. 4, 2021, 8:13 a.m. UTC | #10
Hi Manivannan,

Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
03 Feb 2021 17:11:31 +0530:

> On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:
> >On Wed, 03 Feb 2021 16:22:42 +0530
> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> >  
> >> On 3 February 2021 3:49:14 PM IST, Boris Brezillon  
> ><boris.brezillon@collabora.com> wrote:  
> >> >On Wed, 03 Feb 2021 15:42:02 +0530
> >> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> >> >    
> >> >> >> 
> >> >> >> I got more information from the vendor, Telit. The access to  
> >the    
> >> >3rd      
> >> >> >partition is protected by Trustzone and any access in non  
> >privileged  
> >> >> >mode (where Linux kernel runs) causes kernel panic and the device
> >> >> >reboots.     
> >> >
> >> >Out of curiosity, is it a per-CS-line thing or is this section
> >> >protected on all CS?
> >> >    
> >> 
> >> Sorry, I didn't get your question.   
> >
> >The qcom controller can handle several chips, each connected through a
> >different CS (chip-select) line, right? I'm wondering if the firmware
> >running in secure mode has the ability to block access for a specific
> >CS line or if all CS lines have the same constraint. That will impact
> >the way you describe it in your DT (in one case the secure-region
> >property should be under the controller node, in the other case it
> >should be under the NAND chip node).  
> 
> Right. I believe the implementation is common to all NAND chips so the property should be in the controller node. 

Looks weird: do you mean that each of the chips will have a secure area?
Manivannan Sadhasivam Feb. 4, 2021, 8:52 a.m. UTC | #11
On Thu, Feb 04, 2021 at 09:13:36AM +0100, Miquel Raynal wrote:
> Hi Manivannan,
> 
> Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
> 03 Feb 2021 17:11:31 +0530:
> 
> > On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > >On Wed, 03 Feb 2021 16:22:42 +0530
> > >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > >
> > >> On 3 February 2021 3:49:14 PM IST, Boris Brezillon
> > ><boris.brezillon@collabora.com> wrote:
> > >> >On Wed, 03 Feb 2021 15:42:02 +0530
> > >> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > >> >
> > >> >> >>
> > >> >> >> I got more information from the vendor, Telit. The access to
> > >the
> > >> >3rd
> > >> >> >partition is protected by Trustzone and any access in non
> > >privileged
> > >> >> >mode (where Linux kernel runs) causes kernel panic and the device
> > >> >> >reboots.
> > >> >
> > >> >Out of curiosity, is it a per-CS-line thing or is this section
> > >> >protected on all CS?
> > >> >
> > >>
> > >> Sorry, I didn't get your question.
> > >
> > >The qcom controller can handle several chips, each connected through a
> > >different CS (chip-select) line, right? I'm wondering if the firmware
> > >running in secure mode has the ability to block access for a specific
> > >CS line or if all CS lines have the same constraint. That will impact
> > >the way you describe it in your DT (in one case the secure-region
> > >property should be under the controller node, in the other case it
> > >should be under the NAND chip node).
> > 
> > Right. I believe the implementation is common to all NAND chips so the property should be in the controller node.
> 
> Looks weird: do you mean that each of the chips will have a secure area?


What I meant is that the "secure-region" property will be present in the
controller node and not in the NAND chip node, since this is not related
to the device functionality.

But to reference the NAND device, the property can carry a phandle, as below:

secure-region = <&nand0 0xffff>;

Thanks,
Mani
Boris Brezillon Feb. 4, 2021, 8:59 a.m. UTC | #12
On Thu, 4 Feb 2021 14:22:21 +0530
Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:

> On Thu, Feb 04, 2021 at 09:13:36AM +0100, Miquel Raynal wrote:
> > Hi Manivannan,
> > 
> > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
> > 03 Feb 2021 17:11:31 +0530:
> >   
> > > On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:  
> > > >On Wed, 03 Feb 2021 16:22:42 +0530
> > > >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > >    
> > > >> On 3 February 2021 3:49:14 PM IST, Boris Brezillon    
> > > ><boris.brezillon@collabora.com> wrote:    
> > > >> >On Wed, 03 Feb 2021 15:42:02 +0530
> > > >> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > >> >      
> > > >> >> >> 
> > > >> >> >> I got more information from the vendor, Telit. The access to    
> > > >the      
> > > >> >3rd        
> > > >> >> >partition is protected by Trustzone and any access in non    
> > > >privileged    
> > > >> >> >mode (where Linux kernel runs) causes kernel panic and the device
> > > >> >> >reboots.       
> > > >> >
> > > >> >Out of curiosity, is it a per-CS-line thing or is this section
> > > >> >protected on all CS?
> > > >> >      
> > > >> 
> > > >> Sorry, I didn't get your question.     
> > > >
> > > >The qcom controller can handle several chips, each connected through a
> > > >different CS (chip-select) line, right? I'm wondering if the firmware
> > > >running in secure mode has the ability to block access for a specific
> > > >CS line or if all CS lines have the same constraint. That will impact
> > > >the way you describe it in your DT (in one case the secure-region
> > > >property should be under the controller node, in the other case it
> > > >should be under the NAND chip node).    
> > > 
> > > Right. I believe the implementation is common to all NAND chips so the property should be in the controller node.   
> > 
> > Looks weird: do you mean that each of the chips will have a secure area?  
> 
> I way I said is, the "secure-region" property will be present in the controller
> node and not in the NAND chip node since this is not related to the device
> functionality.
> 
> But for referencing the NAND device, the property can have the phandle as below:
> 
> secure-region = <&nand0 0xffff>;

My question was really what happens from a functional PoV. If you have
per-chip protection at the FW level, this property should be under the
NAND node. OTOH, if the FW doesn't look at the selected chip before
blocking the access, it should be at the controller level. So, you
really have to understand what the secure FW does.
Miquel Raynal Feb. 4, 2021, 8:59 a.m. UTC | #13
Hi Manivannan,

Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Thu,
4 Feb 2021 14:22:21 +0530:

> On Thu, Feb 04, 2021 at 09:13:36AM +0100, Miquel Raynal wrote:
> > Hi Manivannan,
> > 
> > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
> > 03 Feb 2021 17:11:31 +0530:
> >   
> > > On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:  
> > > >On Wed, 03 Feb 2021 16:22:42 +0530
> > > >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > >    
> > > >> On 3 February 2021 3:49:14 PM IST, Boris Brezillon    
> > > ><boris.brezillon@collabora.com> wrote:    
> > > >> >On Wed, 03 Feb 2021 15:42:02 +0530
> > > >> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > >> >      
> > > >> >> >> 
> > > >> >> >> I got more information from the vendor, Telit. The access to    
> > > >the      
> > > >> >3rd        
> > > >> >> >partition is protected by Trustzone and any access in non    
> > > >privileged    
> > > >> >> >mode (where Linux kernel runs) causes kernel panic and the device
> > > >> >> >reboots.       
> > > >> >
> > > >> >Out of curiosity, is it a per-CS-line thing or is this section
> > > >> >protected on all CS?
> > > >> >      
> > > >> 
> > > >> Sorry, I didn't get your question.     
> > > >
> > > >The qcom controller can handle several chips, each connected through a
> > > >different CS (chip-select) line, right? I'm wondering if the firmware
> > > >running in secure mode has the ability to block access for a specific
> > > >CS line or if all CS lines have the same constraint. That will impact
> > > >the way you describe it in your DT (in one case the secure-region
> > > >property should be under the controller node, in the other case it
> > > >should be under the NAND chip node).    
> > > 
> > > Right. I believe the implementation is common to all NAND chips so the property should be in the controller node.   
> > 
> > Looks weird: do you mean that each of the chips will have a secure area?  
> 
> I way I said is, the "secure-region" property will be present in the controller
> node and not in the NAND chip node since this is not related to the device
> functionality.
> 
> But for referencing the NAND device, the property can have the phandle as below:
> 
> secure-region = <&nand0 0xffff>;

Probably more like:

secure-region = <&nand0 0x0 0xFFFF>; // of_node, start, size

but yeah, looks fine by me.

Thanks,
Miquèl
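
The `<of_node, start, size>` layout suggested above can be decoded with a few lines of C. This is only a sketch under stated assumptions: it works on an array of 32-bit cells already read from the property, it assumes each entry is exactly three cells wide, and `struct parsed_region`, `CELLS_PER_REGION` and `parse_secure_regions()` are hypothetical names rather than an existing kernel or devicetree API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* phandle, start, size: cell layout assumed from the thread. */
#define CELLS_PER_REGION 3

struct parsed_region {
	uint32_t phandle;	/* reference to the NAND chip node */
	uint32_t start;		/* first protected byte */
	uint32_t size;		/* length of the protected range */
};

/*
 * Decode @ncells property cells into @out. Returns the number of
 * regions found, or -EINVAL when the cell count is not a multiple of
 * three or the output array is too small.
 */
static int parse_secure_regions(const uint32_t *cells, size_t ncells,
				struct parsed_region *out, size_t max_out)
{
	size_t n = ncells / CELLS_PER_REGION;

	if (ncells % CELLS_PER_REGION || n > max_out)
		return -EINVAL;

	for (size_t i = 0; i < n; i++) {
		out[i].phandle = cells[i * CELLS_PER_REGION];
		out[i].start = cells[i * CELLS_PER_REGION + 1];
		out[i].size = cells[i * CELLS_PER_REGION + 2];
	}

	return (int)n;
}
```

Because the decoder loops over triplets, listing several regions in one property needs no extra code.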
Miquel Raynal Feb. 4, 2021, 9:04 a.m. UTC | #14
Hi Boris,

Boris Brezillon <boris.brezillon@collabora.com> wrote on Thu, 4 Feb
2021 09:59:45 +0100:

> On Thu, 4 Feb 2021 14:22:21 +0530
> Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> 
> > On Thu, Feb 04, 2021 at 09:13:36AM +0100, Miquel Raynal wrote:  
> > > Hi Manivannan,
> > > 
> > > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
> > > 03 Feb 2021 17:11:31 +0530:
> > >     
> > > > On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:    
> > > > >On Wed, 03 Feb 2021 16:22:42 +0530
> > > > >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > > >      
> > > > >> On 3 February 2021 3:49:14 PM IST, Boris Brezillon      
> > > > ><boris.brezillon@collabora.com> wrote:      
> > > > >> >On Wed, 03 Feb 2021 15:42:02 +0530
> > > > >> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > > >> >        
> > > > >> >> >> 
> > > > >> >> >> I got more information from the vendor, Telit. The access to      
> > > > >the        
> > > > >> >3rd          
> > > > >> >> >partition is protected by Trustzone and any access in non      
> > > > >privileged      
> > > > >> >> >mode (where Linux kernel runs) causes kernel panic and the device
> > > > >> >> >reboots.         
> > > > >> >
> > > > >> >Out of curiosity, is it a per-CS-line thing or is this section
> > > > >> >protected on all CS?
> > > > >> >        
> > > > >> 
> > > > >> Sorry, I didn't get your question.       
> > > > >
> > > > >The qcom controller can handle several chips, each connected through a
> > > > >different CS (chip-select) line, right? I'm wondering if the firmware
> > > > >running in secure mode has the ability to block access for a specific
> > > > >CS line or if all CS lines have the same constraint. That will impact
> > > > >the way you describe it in your DT (in one case the secure-region
> > > > >property should be under the controller node, in the other case it
> > > > >should be under the NAND chip node).      
> > > > 
> > > > Right. I believe the implementation is common to all NAND chips so the property should be in the controller node.     
> > > 
> > > Looks weird: do you mean that each of the chips will have a secure area?    
> > 
> > I way I said is, the "secure-region" property will be present in the controller
> > node and not in the NAND chip node since this is not related to the device
> > functionality.
> > 
> > But for referencing the NAND device, the property can have the phandle as below:
> > 
> > secure-region = <&nand0 0xffff>;  
> 
> My question was really what happens from a functional PoV. If you have
> per-chip protection at the FW level, this property should be under the
> NAND node. OTOH, if the FW doesn't look at the selected chip before
> blocking the access, it should be at the controller level. So, you
> really have to understand what the secure FW does.

I'm not so sure actually, that's why I like the phandle to nand0 -> in
any case it's not a property of the NAND chip itself, it's kind of a
host constraint, so I don't get why the property should be at the
NAND node level?

Also, we should probably support several secure regions (which could be
a way to express the fact that the FW does not look at the CS)?
Boris Brezillon Feb. 4, 2021, 9:27 a.m. UTC | #15
On Thu, 4 Feb 2021 10:04:08 +0100
Miquel Raynal <miquel.raynal@bootlin.com> wrote:

> Hi Boris,
> 
> Boris Brezillon <boris.brezillon@collabora.com> wrote on Thu, 4 Feb
> 2021 09:59:45 +0100:
> 
> > On Thu, 4 Feb 2021 14:22:21 +0530
> > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> >   
> > > On Thu, Feb 04, 2021 at 09:13:36AM +0100, Miquel Raynal wrote:    
> > > > Hi Manivannan,
> > > > 
> > > > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
> > > > 03 Feb 2021 17:11:31 +0530:
> > > >       
> > > > > On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:      
> > > > > >On Wed, 03 Feb 2021 16:22:42 +0530
> > > > > >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > > > >        
> > > > > >> On 3 February 2021 3:49:14 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > > > > >> >On Wed, 03 Feb 2021 15:42:02 +0530
> > > > > >> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > > > >> >          
> > > > > >> >> >> 
> > > > > >> >> >> I got more information from the vendor, Telit. The access to the
> > > > > >> >> >> 3rd partition is protected by Trustzone and any access in non
> > > > > >> >> >> privileged mode (where Linux kernel runs) causes kernel panic and
> > > > > >> >> >> the device reboots.
> > > > > >> >
> > > > > >> >Out of curiosity, is it a per-CS-line thing or is this section
> > > > > >> >protected on all CS?
> > > > > >> >          
> > > > > >> 
> > > > > >> Sorry, I didn't get your question.         
> > > > > >
> > > > > >The qcom controller can handle several chips, each connected through a
> > > > > >different CS (chip-select) line, right? I'm wondering if the firmware
> > > > > >running in secure mode has the ability to block access for a specific
> > > > > >CS line or if all CS lines have the same constraint. That will impact
> > > > > >the way you describe it in your DT (in one case the secure-region
> > > > > >property should be under the controller node, in the other case it
> > > > > >should be under the NAND chip node).        
> > > > > 
> > > > > Right. I believe the implementation is common to all NAND chips so the property should be in the controller node.       
> > > > 
> > > > Looks weird: do you mean that each of the chips will have a secure area?      
> > > 
> > > What I meant is, the "secure-region" property will be present in the controller
> > > node and not in the NAND chip node since this is not related to the device
> > > functionality.
> > > 
> > > But for referencing the NAND device, the property can have the phandle as below:
> > > 
> > > secure-region = <&nand0 0xffff>;    
> > 
> > My question was really what happens from a functional PoV. If you have
> > per-chip protection at the FW level, this property should be under the
> > NAND node. OTOH, if the FW doesn't look at the selected chip before
> > blocking the access, it should be at the controller level. So, you
> > really have to understand what the secure FW does.  
> 
> I'm not so sure actually, that's why I like the phandle to nand0 -> in
> any case it's not a property of the NAND chip itself, it's kind of a
> host constraint, so I don't get why the property should be at the
> NAND node level?

I would argue that we already have plenty of NAND properties that
encode things controlled by the host (ECC, partitions, HW randomizer,
boot device, and all kinds of controller-specific stuff) :P. Having
the props under the NAND node makes it clear what those things are
applied to, and it's also easier to parse for the driver (you already
have to parse each node to get the reg property anyway).

> 
> Also, we should probably support several secure regions (which could be
> a way to express the fact that the FW does not look at the CS)?

Sure, the secure-region should probably be renamed secure-regions, even
if it's defined at the NAND chip level.
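
To make the per-chip "secure-regions" idea above a bit more concrete, here is
a minimal, standalone sketch of how a list of protected ranges could be
checked before touching a block. Everything in it is an assumption made for
illustration: the struct names, the (offset, size) layout of an entry and the
helper range_is_secure() are hypothetical and are not part of this patch or
of any existing binding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one protected range, in bytes from the start of the chip. */
struct secure_region {
	uint64_t offset;
	uint64_t size;
};

/* Hypothetical per-chip state holding the parsed "secure-regions" entries. */
struct chip_secure_info {
	const struct secure_region *regions;
	size_t nr_regions;
};

/*
 * Return true if [offs, offs + len) overlaps any secure region of the chip.
 * Sketch only: it assumes offs + len and offset + size do not overflow.
 */
static bool range_is_secure(const struct chip_secure_info *info,
			    uint64_t offs, uint64_t len)
{
	size_t i;

	if (!len)
		return false;

	for (i = 0; i < info->nr_regions; i++) {
		const struct secure_region *r = &info->regions[i];

		if (!r->size)
			continue;

		/* Half-open ranges overlap when each starts before the other ends. */
		if (offs < r->offset + r->size && r->offset < offs + len)
			return true;
	}

	return false;
}
```

With the list attached to each chip, a driver could refuse an access (or skip
a bad block check) for any range where range_is_secure() returns true, and a
chip without protected areas would simply carry an empty list.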
Miquel Raynal Feb. 4, 2021, 9:31 a.m. UTC | #16
Hi Boris,

Boris Brezillon <boris.brezillon@collabora.com> wrote on Thu, 4 Feb
2021 10:27:38 +0100:

> On Thu, 4 Feb 2021 10:04:08 +0100
> Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> 
> > Hi Boris,
> > 
> > Boris Brezillon <boris.brezillon@collabora.com> wrote on Thu, 4 Feb
> > 2021 09:59:45 +0100:
> >   
> > > On Thu, 4 Feb 2021 14:22:21 +0530
> > > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > >     
> > > > On Thu, Feb 04, 2021 at 09:13:36AM +0100, Miquel Raynal wrote:      
> > > > > Hi Manivannan,
> > > > > 
> > > > > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote on Wed,
> > > > > 03 Feb 2021 17:11:31 +0530:
> > > > >         
> > > > > > On 3 February 2021 4:54:22 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:        
> > > > > > >On Wed, 03 Feb 2021 16:22:42 +0530
> > > > > > >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > > > > >          
> > > > > > >> On 3 February 2021 3:49:14 PM IST, Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > > > > > >> >On Wed, 03 Feb 2021 15:42:02 +0530
> > > > > > >> >Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > > > > >> >            
> > > > > > >> >> >> 
> > > > > > >> >> >> I got more information from the vendor, Telit. The access to the
> > > > > > >> >> >> 3rd partition is protected by Trustzone and any access in non
> > > > > > >> >> >> privileged mode (where Linux kernel runs) causes kernel panic and
> > > > > > >> >> >> the device reboots.
> > > > > > >> >
> > > > > > >> >Out of curiosity, is it a per-CS-line thing or is this section
> > > > > > >> >protected on all CS?
> > > > > > >> >            
> > > > > > >> 
> > > > > > >> Sorry, I didn't get your question.           
> > > > > > >
> > > > > > >The qcom controller can handle several chips, each connected through a
> > > > > > >different CS (chip-select) line, right? I'm wondering if the firmware
> > > > > > >running in secure mode has the ability to block access for a specific
> > > > > > >CS line or if all CS lines have the same constraint. That will impact
> > > > > > >the way you describe it in your DT (in one case the secure-region
> > > > > > >property should be under the controller node, in the other case it
> > > > > > >should be under the NAND chip node).          
> > > > > > 
> > > > > > Right. I believe the implementation is common to all NAND chips so the property should be in the controller node.         
> > > > > 
> > > > > Looks weird: do you mean that each of the chips will have a secure area?        
> > > > 
> > > > What I meant is, the "secure-region" property will be present in the controller
> > > > node and not in the NAND chip node since this is not related to the device
> > > > functionality.
> > > > 
> > > > But for referencing the NAND device, the property can have the phandle as below:
> > > > 
> > > > secure-region = <&nand0 0xffff>;      
> > > 
> > > My question was really what happens from a functional PoV. If you have
> > > per-chip protection at the FW level, this property should be under the
> > > NAND node. OTOH, if the FW doesn't look at the selected chip before
> > > blocking the access, it should be at the controller level. So, you
> > > really have to understand what the secure FW does.    
> > 
> > I'm not so sure actually, that's why I like the phandle to nand0 -> in
> > any case it's not a property of the NAND chip itself, it's kind of a
> > host constraint, so I don't get why the property should be at the
> > NAND node level?  
> 
> I would argue that we already have plenty of NAND properties that
> encode things controlled by the host (ECC, partitions, HW randomizer,
> boot device, and all kinds of controller-specific stuff) :P. Having
> the props under the NAND node makes it clear what those things are
> applied to, and it's also easier to parse for the driver (you already
> have to parse each node to get the reg property anyway).

Fair points.

> > Also, we should probably support several secure regions (which could be
> > a way to express the fact that the FW does not look at the CS)?  
> 
> Sure, the secure-region should probably be renamed secure-regions, even
> if it's defined at the NAND chip level.

Absolutely.

Thanks,
Miquèl
Patch

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index c33fa1b1847f..f18cd1db79a9 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -4286,6 +4286,9 @@  static int nand_block_isbad(struct mtd_info *mtd, loff_t offs)
 	int chipnr = (int)(offs >> chip->chip_shift);
 	int ret;
 
+	if (!chip->bbt)
+		return 0;
+
 	/* Select the NAND device */
 	ret = nand_get_device(chip);
 	if (ret)