diff mbox series

[v4] mmc: renesas_sdhi: Fix rounding errors

Message ID 20220926171002.62352-1-biju.das.jz@bp.renesas.com
State New
Headers show
Series [v4] mmc: renesas_sdhi: Fix rounding errors | expand

Commit Message

Biju Das Sept. 26, 2022, 5:10 p.m. UTC
Due to clk rounding errors on RZ/G2L platforms, it selects a clock source
with a lower clock rate compared to a higher one.
For eg: The rounding error (533333333 Hz / 4 * 4 = 533333332 Hz < 5333333
33 Hz) selects a clk source of 400 MHz instead of 533.333333 MHz.

This patch fixes this issue by adding a margin of (1/1024) higher to
the clock rate.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---
v3->v4:
 * Added Tested-by tag from Wolfram.
 * Updated commit description and code comment with real example.
v2->v3:
 * Renamed the variable new_clock_margin->new_upper_limit in renesas_sdhi_clk_
   update()
 * Moved setting of new_upper_limit outside for loop.
 * Updated the comment section to mention the rounding errors and merged with
   existing comment out side the for loop.
 * Updated commit description. 
v1->v2:
 * Add a comment explaining why margin is needed and set it to
   that particular value.
---
 drivers/mmc/host/renesas_sdhi_core.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

Comments

Biju Das Sept. 28, 2022, 11:01 a.m. UTC | #1
Hi Geert,

> Subject: RE: [PATCH v4] mmc: renesas_sdhi: Fix rounding errors
> 
> Hi Geert,
> 
> Thanks for the testing.
> 
> > Subject: Re: [PATCH v4] mmc: renesas_sdhi: Fix rounding errors
> >
> > Hi Biju,
> >
> > On Mon, Sep 26, 2022 at 7:10 PM Biju Das
> <biju.das.jz@bp.renesas.com>
> > wrote:
> > > Due to clk rounding errors on RZ/G2L platforms, it selects a clock
> > > source with a lower clock rate compared to a higher one.
> > > For eg: The rounding error (533333333 Hz / 4 * 4 = 533333332 Hz <
> > > 5333333
> > > 33 Hz) selects a clk source of 400 MHz instead of 533.333333 MHz.
> > >
> > > This patch fixes this issue by adding a margin of (1/1024) higher
> to
> > > the clock rate.
> > >
> > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > > Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> > > ---
> > > v3->v4:
> > >  * Added Tested-by tag from Wolfram.
> > >  * Updated commit description and code comment with real example.
> >
> > Thanks for the update!
> >
> > Unfortunately this patch causes a change in the clock frequencies
> used
> > on R-Car M2-W:
> >
> >     -clk_summary:          sd0                 97500000
> >     +clk_summary:          sd0                 32500000
> >     -clk_summary:             sdhi0            97500000
> >     +clk_summary:             sdhi0            32500000
> >
> >     -clk_summary:             sd3              12786885
> >     +clk_summary:             sd3              12187500
> >     -clk_summary:                sdhi3         12786885
> >     +clk_summary:                sdhi3         12187500
> >
> >     -clk_summary:             sd2              12786885
> >     +clk_summary:             sd2              12187500
> >     -clk_summary:                sdhi2         12786885
> >     +clk_summary:                sdhi2         12187500
> 
> That is not good.

Other than this, it is better to test performance as well to
check any regression due to second fix in this patch.

Note:
After mounting, I use below command to check performance.
dd if=/dev/zero of=test oflag=direct bs=8M count=64

Cheers,
Biju

> 
> >
> > > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> >
> > > @@ -153,10 +154,17 @@ static unsigned int
> > renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> > >          * greater than, new_clock.  As we can divide by 1 << i
> for
> > >          * any i in [0, 9] we want the input clock to be as close
> as
> > >          * possible, but no greater than, new_clock << i.
> > > +        *
> > > +        * Add an upper limit of 1/1024 rate higher to the clock
> > rate to fix
> > > +        * clk rate jumping to lower rate due to rounding error
> (eg:
> > RZ/G2L has
> > > +        * 3 clk sources 533.333333 MHz, 400 MHz and 266.666666
> MHz.
> > The request
> > > +        * for 533.333333 MHz will selects a slower 400 MHz due to
> > rounding
> > > +        * error (533333333 Hz / 4 * 4 = 533333332 Hz < 533333333
> > Hz)).
> > >          */
> > > +       new_upper_limit = (new_clock << i) + ((new_clock << i) >>
> > 10);
> >
> > Mea culpa: while new_clock is a constant inside the loop, i is not!
> > So it should be moved back inside the loop below.
> > With that change, R-Car M2-W is happy again, and I noticed no
> > regression on R-Car H3 ES2.0.
> 
> OK.
> 
> >
> > >         for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--)
> {
> > >                 freq = clk_round_rate(ref_clk, new_clock << i);
> > > -               if (freq > (new_clock << i)) {
> > > +               if (freq > new_upper_limit) {
> > >                         /* Too fast; look for a slightly slower
> > option */
> > >                         freq = clk_round_rate(ref_clk, (new_clock
> <<
> > i) / 4 * 3);
> > >                         if (freq > (new_clock << i))
> >                                      ^^^^^^^^^^^^^^^^ Probably this
> > should become new_upper_limit too, for consistency?
> > It doesn't seem to matter in my testing, though.
> 
> OK. Will do the below change in next version.
> 
> -	new_upper_limit = (new_clock << i) + ((new_clock << i) >> 10);
>  	for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) {
>  		freq = clk_round_rate(ref_clk, new_clock << i);
> +		new_upper_limit = (new_clock << i) + ((new_clock << i) >>
> 10);
>  		if (freq > new_upper_limit) {
>  			/* Too fast; look for a slightly slower option */
>  			freq = clk_round_rate(ref_clk, (new_clock << i) / 4 *
> 3);
> -			if (freq > (new_clock << i))
> +			if (freq > new_upper_limit)
>  				continue;
>  		}
> 
> Cheers,
> Biju
Biju Das Sept. 30, 2022, 8:04 a.m. UTC | #2
Hi kernel test robot,

>; Prabhakar Mahadev Lad
> <prabhakar.mahadev-lad.rj@bp.renesas.com>
> Subject: Re: [PATCH v4] mmc: renesas_sdhi: Fix rounding errors
> 
> Hi Biju,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
> >> drivers/mmc/host/renesas_sdhi_core.c:164:34: warning: variable 'i'
> is
> >> uninitialized when used here [-Wuninitialized]
>            new_upper_limit = (new_clock << i) + ((new_clock << i) >>
> 10);
>                                            ^
>    drivers/mmc/host/renesas_sdhi_core.c:132:7: note: initialize the
> variable 'i' to silence this warning
>            int i;
>                 ^
>                  = 0
>    1 warning generated.
> 
> 
> vim +/i +164 drivers/mmc/host/renesas_sdhi_core.c

I have sent v5. 
https://patchwork.kernel.org/project/linux-renesas-soc/patch/20220928110755.849275-1-biju.das.jz@bp.renesas.com/

Cheers,
Biju


> 
>    123
>    124	static unsigned int renesas_sdhi_clk_update(struct
> tmio_mmc_host *host,
>    125						    unsigned int wanted_clock)
>    126	{
>    127		struct renesas_sdhi *priv = host_to_priv(host);
>    128		struct clk *ref_clk = priv->clk;
>    129		unsigned int freq, diff, best_freq = 0, diff_min =
> ~0;
>    130		unsigned int new_clock, clkh_shift = 0;
>    131		unsigned int new_upper_limit;
>    132		int i;
>    133
>    134		/*
>    135		 * We simply return the current rate if a) we are not
> on a R-Car Gen2+
>    136		 * SoC (may work for others, but untested) or b) if
> the SCC needs its
>    137		 * clock during tuning, so we don't change the
> external clock setup.
>    138		 */
>    139		if (!(host->pdata->flags & TMIO_MMC_MIN_RCAR2) ||
> mmc_doing_tune(host->mmc))
>    140			return clk_get_rate(priv->clk);
>    141
>    142		if (priv->clkh) {
>    143			/* HS400 with 4TAP needs different clock
> settings */
>    144			bool use_4tap = priv->quirks && priv->quirks-
> >hs400_4taps;
>    145			bool need_slow_clkh = host->mmc->ios.timing ==
> MMC_TIMING_MMC_HS400;
>    146			clkh_shift = use_4tap && need_slow_clkh ? 1 :
> 2;
>    147			ref_clk = priv->clkh;
>    148		}
>    149
>    150		new_clock = wanted_clock << clkh_shift;
>    151
>    152		/*
>    153		 * We want the bus clock to be as close as possible
> to, but no
>    154		 * greater than, new_clock.  As we can divide by 1 <<
> i for
>    155		 * any i in [0, 9] we want the input clock to be as
> close as
>    156		 * possible, but no greater than, new_clock << i.
>    157		 *
>    158		 * Add an upper limit of 1/1024 rate higher to the
> clock rate to fix
>    159		 * clk rate jumping to lower rate due to rounding
> error (eg: RZ/G2L has
>    160		 * 3 clk sources 533.333333 MHz, 400 MHz and
> 266.666666 MHz. The request
>    161		 * for 533.333333 MHz will selects a slower 400 MHz
> due to rounding
>    162		 * error (533333333 Hz / 4 * 4 = 533333332 Hz <
> 533333333 Hz)).
>    163		 */
>  > 164		new_upper_limit = (new_clock << i) + ((new_clock <<
> i) >> 10);
>    165		for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0;
> i--) {
>    166			freq = clk_round_rate(ref_clk, new_clock << i);
>    167			if (freq > new_upper_limit) {
>    168				/* Too fast; look for a slightly slower
> option */
>    169				freq = clk_round_rate(ref_clk, (new_clock
> << i) / 4 * 3);
>    170				if (freq > (new_clock << i))
>    171					continue;
>    172			}
>    173
>    174			diff = new_clock - (freq >> i);
>    175			if (diff <= diff_min) {
>    176				best_freq = freq;
>    177				diff_min = diff;
>    178			}
>    179		}
>    180
>    181		clk_set_rate(ref_clk, best_freq);
>    182
>    183		if (priv->clkh)
>    184			clk_set_rate(priv->clk, best_freq >>
> clkh_shift);
>    185
>    186		return clk_get_rate(priv->clk);
>    187	}
>    188
> 
> --
> 0-DAY CI Kernel Test Service
diff mbox series

Patch

diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 6edbf5c161ab..d1b8130ee37f 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -128,6 +128,7 @@  static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
 	struct clk *ref_clk = priv->clk;
 	unsigned int freq, diff, best_freq = 0, diff_min = ~0;
 	unsigned int new_clock, clkh_shift = 0;
+	unsigned int new_upper_limit;
 	int i;
 
 	/*
@@ -153,10 +154,17 @@  static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
 	 * greater than, new_clock.  As we can divide by 1 << i for
 	 * any i in [0, 9] we want the input clock to be as close as
 	 * possible, but no greater than, new_clock << i.
+	 *
+	 * Add an upper limit of 1/1024 rate higher to the clock rate to fix
+	 * clk rate jumping to lower rate due to rounding error (eg: RZ/G2L has
+	 * 3 clk sources 533.333333 MHz, 400 MHz and 266.666666 MHz. The request
+	 * for 533.333333 MHz will selects a slower 400 MHz due to rounding
+	 * error (533333333 Hz / 4 * 4 = 533333332 Hz < 533333333 Hz)).
 	 */
+	new_upper_limit = (new_clock << i) + ((new_clock << i) >> 10);
 	for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) {
 		freq = clk_round_rate(ref_clk, new_clock << i);
-		if (freq > (new_clock << i)) {
+		if (freq > new_upper_limit) {
 			/* Too fast; look for a slightly slower option */
 			freq = clk_round_rate(ref_clk, (new_clock << i) / 4 * 3);
 			if (freq > (new_clock << i))
@@ -181,6 +189,7 @@  static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
 static void renesas_sdhi_set_clock(struct tmio_mmc_host *host,
 				   unsigned int new_clock)
 {
+	unsigned int clk_margin;
 	u32 clk = 0, clock;
 
 	sd_ctrl_write16(host, CTL_SD_CARD_CLK_CTL, ~CLK_CTL_SCLKEN &
@@ -194,7 +203,13 @@  static void renesas_sdhi_set_clock(struct tmio_mmc_host *host,
 	host->mmc->actual_clock = renesas_sdhi_clk_update(host, new_clock);
 	clock = host->mmc->actual_clock / 512;
 
-	for (clk = 0x80000080; new_clock >= (clock << 1); clk >>= 1)
+	/*
+	 * Add a margin of 1/1024 rate higher to the clock rate in order
+	 * to avoid clk variable setting a value of 0 due to the margin
+	 * provided for actual_clock in renesas_sdhi_clk_update().
+	 */
+	clk_margin = new_clock >> 10;
+	for (clk = 0x80000080; new_clock + clk_margin >= (clock << 1); clk >>= 1)
 		clock <<= 1;
 
 	/* 1/1 clock is option */