Message ID | alpine.DEB.2.21.2204240214430.9383@angie.orcam.me.uk |
---|---|
State | Accepted |
Commit | f0a6c68f69981214cb7858738dd2bc81475111f7 |
Headers | show |
Series | MIPS: Fix CP0 counter erratum detection for R4k CPUs | expand |
On Sun, Apr 24, 2022 at 12:46:23PM +0100, Maciej W. Rozycki wrote: > Fix the discrepancy between the two places we check for the CP0 counter > erratum in along with the incorrect comparison of the R4400 revision > number against 0x30 which matches none and consistently consider all > R4000 and R4400 processors affected, as documented in processor errata > publications[1][2][3], following the mapping between CP0 PRId register > values and processor models: > > PRId | Processor Model > ---------+-------------------- > 00000422 | R4000 Revision 2.2 > 00000430 | R4000 Revision 3.0 > 00000440 | R4400 Revision 1.0 > 00000450 | R4400 Revision 2.0 > 00000460 | R4400 Revision 3.0 interesting, where is this documented ? And it's quite funny that so far everybody messed up revision printing for R4400 CPUs. > Please review the requirements for SNI platforms. In the case of an > erratic CP0 timer we give precedence to the use as a clock event rather > than clock source device; see `time_init' in arch/mips/kernel/time.c. > Therefore if SNI systems have no alternative timer interrupt source, then > the CP0 timer is supposed to still do regardless of the erratum. > > Conversely a system can do without a high-precision clock source, in > which case jiffies will be used. Of course such a system will suffer if > used for precision timekeeping, but such is the price for broken hardware. > Don't SNI systems have any alternative timer available, not even the > venerable 8254? all SNI systems have a i8254 in their EISA/PCI chipsets. But they aren't that nice for clock events as their interupts are connected via an i8259 addresses via ISA PIO. > With the considerations above in mind, please apply. will do later. Thomas.
On Fri, 29 Apr 2022, Thomas Bogendoerfer wrote: > > Fix the discrepancy between the two places we check for the CP0 counter > > erratum in along with the incorrect comparison of the R4400 revision > > number against 0x30 which matches none and consistently consider all > > R4000 and R4400 processors affected, as documented in processor errata > > publications[1][2][3], following the mapping between CP0 PRId register > > values and processor models: > > > > PRId | Processor Model > > ---------+-------------------- > > 00000422 | R4000 Revision 2.2 > > 00000430 | R4000 Revision 3.0 > > 00000440 | R4400 Revision 1.0 > > 00000450 | R4400 Revision 2.0 > > 00000460 | R4400 Revision 3.0 > > interesting, where is this documented ? And it's quite funny that so far > everybody messed up revision printing for R4400 CPUs. That's just observation combined with past discussions with Ralf. Basically the PRId implementation number is 0x04 for both the R4000 and the R4400 (the only difference between the two CPUs is the addition of the write-back buffer in the latter one, making it weakly ordered). And then the PRId revision number matches exactly the documented CPU revision for the R4000, while for the R4400 you need to subtract 3 from the PRId revision number to get the documented CPU revision (i.e. what would be R4000 Revision 4.0 actually became R4400 Revision 1.0). At this time no old MIPSer from the SGI days may be around to confirm or contradict this observation. Documentation explicitly says[1]: "The revision number can distinguish some chip revisions, however there is no guarantee that changes to the chip will necessarily be reflected in the PRId register, or that changes to the revision number necessarily reflect real chip changes. For this reason, these values are not listed and software should not rely on the revision number in the PRId register to characterize the chip." but surely the author didn't have errata workarounds in mind plus all CPU revisions have already been manufactured so the mapping has been fixed. > > Conversely a system can do without a high-precision clock source, in > > which case jiffies will be used. Of course such a system will suffer if > > used for precision timekeeping, but such is the price for broken hardware. > > Don't SNI systems have any alternative timer available, not even the > > venerable 8254? > > all SNI systems have a i8254 in their EISA/PCI chipsets. But they aren't > that nice for clock events as their interupts are connected via an i8259 > addresses via ISA PIO. Interrupts are used for clock events, so you don't need one here. For clock sources you just read the counter. That said reading from the 8254 is messy too and you may need a spinlock (you need to write the Counter Latch or Read-Back command to the control register and then issue consecutive two reads to the requested timer's data register[2]). Which is I guess why support for it has been removed from x86 code. For non-SMP it might be good enough. > > With the considerations above in mind, please apply. > > will do later. Great, thanks! References: [1] Joe Heinrich: "MIPS R4000 Microprocessor User's Manual", Second Edition, MIPS Technologies, Inc., April 1, 1994, Chapter 4 "Memory Management", p.90 [2] "8254 Programmable Interval Timer", Intel Corporation, Order Number: 231164-005, September 1993, Section "Read Operations", pp.7-9 Maciej
On Mon, Apr 25, 2022 at 9:39 AM Maciej W. Rozycki <macro@orcam.me.uk> wrote: > > Fix the discrepancy between the two places we check for the CP0 counter > erratum in along with the incorrect comparison of the R4400 revision > number against 0x30 which matches none and consistently consider all > R4000 and R4400 processors affected, as documented in processor errata > publications[1][2][3], following the mapping between CP0 PRId register > values and processor models: > > PRId | Processor Model > ---------+-------------------- > 00000422 | R4000 Revision 2.2 > 00000430 | R4000 Revision 3.0 > 00000440 | R4400 Revision 1.0 > 00000450 | R4400 Revision 2.0 > 00000460 | R4400 Revision 3.0 > > No other revision of either processor has ever been spotted. > > Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume > R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the > CP0 counter as buggy does not preclude it from being used as either a > clock event or a clock source device. It just cannot be used as both at > a time, because in that case clock event interrupts will be occasionally > lost, and the use as a clock event device takes precedence. > > Compare against 0x4ff in `can_use_mips_counter' so that a single machine > instruction is produced. > > References: > > [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS > Technologies Inc., May 10, 1994, Erratum 53, p.13 > > [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies > Inc., February 9, 1994, Erratum 21, p.4 > > [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS > Technologies Inc., January 24, 1995, Erratum 14, p.3 > > Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> > Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug") > Cc: stable@vger.kernel.org # v2.6.24+ > --- > arch/mips/include/asm/timex.h | 8 ++++---- > arch/mips/kernel/time.c | 11 +++-------- > 2 files changed, 7 insertions(+), 12 deletions(-) Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
On Sun, Apr 24, 2022 at 12:46:23PM +0100, Maciej W. Rozycki wrote: > Fix the discrepancy between the two places we check for the CP0 counter > erratum in along with the incorrect comparison of the R4400 revision > number against 0x30 which matches none and consistently consider all > R4000 and R4400 processors affected, as documented in processor errata > publications[1][2][3], following the mapping between CP0 PRId register > values and processor models: > > PRId | Processor Model > ---------+-------------------- > 00000422 | R4000 Revision 2.2 > 00000430 | R4000 Revision 3.0 > 00000440 | R4400 Revision 1.0 > 00000450 | R4400 Revision 2.0 > 00000460 | R4400 Revision 3.0 > > No other revision of either processor has ever been spotted. > > Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume > R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the > CP0 counter as buggy does not preclude it from being used as either a > clock event or a clock source device. It just cannot be used as both at > a time, because in that case clock event interrupts will be occasionally > lost, and the use as a clock event device takes precedence. > > Compare against 0x4ff in `can_use_mips_counter' so that a single machine > instruction is produced. > > References: > > [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS > Technologies Inc., May 10, 1994, Erratum 53, p.13 > > [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies > Inc., February 9, 1994, Erratum 21, p.4 > > [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS > Technologies Inc., January 24, 1995, Erratum 14, p.3 > > Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> > Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug") > Cc: stable@vger.kernel.org # v2.6.24+ > --- > Thomas, > > Please review the requirements for SNI platforms. In the case of an > erratic CP0 timer we give precedence to the use as a clock event rather > than clock source device; see `time_init' in arch/mips/kernel/time.c. > Therefore if SNI systems have no alternative timer interrupt source, then > the CP0 timer is supposed to still do regardless of the erratum. > > Conversely a system can do without a high-precision clock source, in > which case jiffies will be used. Of course such a system will suffer if > used for precision timekeeping, but such is the price for broken hardware. > Don't SNI systems have any alternative timer available, not even the > venerable 8254? > > Long-term I think this code should be factored out and rewritten so that > it lives in one place and can take advantage of compile-time constants, > which will be the case for the majority of platforms. For the time being > fix the immediate breakage however. > > With the considerations above in mind, please apply. > > Maciej > --- > arch/mips/include/asm/timex.h | 8 ++++---- > arch/mips/kernel/time.c | 11 +++-------- > 2 files changed, 7 insertions(+), 12 deletions(-) applied to mips-fixes. Thomas.
> That said reading from the 8254 is messy too and you may need a spinlock > (you need to write the Counter Latch or Read-Back command to the control > register and then issue consecutive two reads to the requested timer's > data register[2]). Which is I guess why support for it has been removed > from x86 code. For non-SMP it might be good enough. It is important to 'latch' the counter before reading it. I tried to optimise some code to avoid the extra accesses. In principle it ought to have worked, but reading the unlatched values gave garbage - not just messed up 16bit values. It was probably returning a value that was being ripple-counted. The sheer number of IO cycles you need to read the counters just beggars belief. So while they can be used to get an accurate timestamp it really takes too long to be useful. Even in a modern x86 chipset I think they are still ISA speed cycles. (Mind you, we've an fpga based PCIe boards where reads are ISA speed....) OTOH I'd have though that for a real 486 (one without a TSC) that is the only way to get a high res timer count. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Index: linux-macro/arch/mips/include/asm/timex.h =================================================================== --- linux-macro.orig/arch/mips/include/asm/timex.h +++ linux-macro/arch/mips/include/asm/timex.h @@ -40,9 +40,9 @@ typedef unsigned int cycles_t; /* - * On R4000/R4400 before version 5.0 an erratum exists such that if the - * cycle counter is read in the exact moment that it is matching the - * compare register, no interrupt will be generated. + * On R4000/R4400 an erratum exists such that if the cycle counter is + * read in the exact moment that it is matching the compare register, + * no interrupt will be generated. * * There is a suggested workaround and also the erratum can't strike if * the compare interrupt isn't being used as the clock source device. @@ -63,7 +63,7 @@ static inline int can_use_mips_counter(u if (!__builtin_constant_p(cpu_has_counter)) asm volatile("" : "=m" (cpu_data[0].options)); if (likely(cpu_has_counter && - prid >= (PRID_IMP_R4000 | PRID_REV_ENCODE_44(5, 0)))) + prid > (PRID_IMP_R4000 | PRID_REV_ENCODE_44(15, 15)))) return 1; else return 0; Index: linux-macro/arch/mips/kernel/time.c =================================================================== --- linux-macro.orig/arch/mips/kernel/time.c +++ linux-macro/arch/mips/kernel/time.c @@ -141,15 +141,10 @@ static __init int cpu_has_mfc0_count_bug case CPU_R4400MC: /* * The published errata for the R4400 up to 3.0 say the CPU - * has the mfc0 from count bug. - */ - if ((current_cpu_data.processor_id & 0xff) <= 0x30) - return 1; - - /* - * we assume newer revisions are ok + * has the mfc0 from count bug. This seems the last version + * produced. */ - return 0; + return 1; } return 0;
Fix the discrepancy between the two places we check for the CP0 counter erratum in along with the incorrect comparison of the R4400 revision number against 0x30 which matches none and consistently consider all R4000 and R4400 processors affected, as documented in processor errata publications[1][2][3], following the mapping between CP0 PRId register values and processor models: PRId | Processor Model ---------+-------------------- 00000422 | R4000 Revision 2.2 00000430 | R4000 Revision 3.0 00000440 | R4400 Revision 1.0 00000450 | R4400 Revision 2.0 00000460 | R4400 Revision 3.0 No other revision of either processor has ever been spotted. Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the CP0 counter as buggy does not preclude it from being used as either a clock event or a clock source device. It just cannot be used as both at a time, because in that case clock event interrupts will be occasionally lost, and the use as a clock event device takes precedence. Compare against 0x4ff in `can_use_mips_counter' so that a single machine instruction is produced. References: [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS Technologies Inc., May 10, 1994, Erratum 53, p.13 [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies Inc., February 9, 1994, Erratum 21, p.4 [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS Technologies Inc., January 24, 1995, Erratum 14, p.3 Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug") Cc: stable@vger.kernel.org # v2.6.24+ --- Thomas, Please review the requirements for SNI platforms. In the case of an erratic CP0 timer we give precedence to the use as a clock event rather than clock source device; see `time_init' in arch/mips/kernel/time.c. Therefore if SNI systems have no alternative timer interrupt source, then the CP0 timer is supposed to still do regardless of the erratum. Conversely a system can do without a high-precision clock source, in which case jiffies will be used. Of course such a system will suffer if used for precision timekeeping, but such is the price for broken hardware. Don't SNI systems have any alternative timer available, not even the venerable 8254? Long-term I think this code should be factored out and rewritten so that it lives in one place and can take advantage of compile-time constants, which will be the case for the majority of platforms. For the time being fix the immediate breakage however. With the considerations above in mind, please apply. Maciej --- arch/mips/include/asm/timex.h | 8 ++++---- arch/mips/kernel/time.c | 11 +++-------- 2 files changed, 7 insertions(+), 12 deletions(-) linux-mips-r4k-count-erratum-prid.diff