Message ID | 20210611162000.2438023-1-anthony.l.nguyen@intel.com |
---|---|
Headers | show |
Series | 100GbE Intel Wired LAN Driver Updates 2021-06-11 | expand |
Hello: This series was applied to netdev/net-next.git (refs/heads/master): On Fri, 11 Jun 2021 09:19:52 -0700 you wrote: > Jake Keller says: > > Extend the ice driver to support basic PTP clock functionality for E810 > devices. > > This includes some tangential work required to setup the sideband queue and > driver shared parameters as well. > > [...] Here is the summary with links: - [net-next,1/8] ice: add support for sideband messages https://git.kernel.org/netdev/net-next/c/8f5ee3c477a8 - [net-next,2/8] ice: process 1588 PTP capabilities during initialization https://git.kernel.org/netdev/net-next/c/9733cc94c523 - [net-next,3/8] ice: add support for set/get of driver-stored firmware parameters https://git.kernel.org/netdev/net-next/c/7f9ab54d3144 - [net-next,4/8] ice: add low level PTP clock access functions https://git.kernel.org/netdev/net-next/c/03cb4473be92 - [net-next,5/8] ice: register 1588 PTP clock device object for E810 devices https://git.kernel.org/netdev/net-next/c/06c16d89d2cb - [net-next,6/8] ice: report the PTP clock index in ethtool .get_ts_info https://git.kernel.org/netdev/net-next/c/67569a7f9401 - [net-next,7/8] ice: enable receive hardware timestamping https://git.kernel.org/netdev/net-next/c/77a781155a65 - [net-next,8/8] ice: enable transmit timestamps for E810 devices https://git.kernel.org/netdev/net-next/c/ea9b847cda64 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
On 6/11/2021 2:18 PM, Jakub Kicinski wrote: > On Fri, 11 Jun 2021 09:19:57 -0700 Tony Nguyen wrote: >> +static u64 >> +ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts) >> +{ >> + struct ice_hw *hw = &pf->hw; >> + u32 hi, lo, lo2; >> + u8 tmr_idx; >> + >> + tmr_idx = ice_get_ptp_src_clock_index(hw); >> + /* Read the system timestamp pre PHC read */ >> + if (sts) >> + ptp_read_system_prets(sts); >> + >> + lo = rd32(hw, GLTSYN_TIME_L(tmr_idx)); >> + >> + /* Read the system timestamp post PHC read */ >> + if (sts) >> + ptp_read_system_postts(sts); >> + >> + hi = rd32(hw, GLTSYN_TIME_H(tmr_idx)); >> + lo2 = rd32(hw, GLTSYN_TIME_L(tmr_idx)); >> + >> + if (lo2 < lo) { >> + /* if TIME_L rolled over read TIME_L again and update >> + * system timestamps >> + */ >> + if (sts) >> + ptp_read_system_prets(sts); >> + lo = rd32(hw, GLTSYN_TIME_L(tmr_idx)); >> + if (sts) >> + ptp_read_system_postts(sts); > > ptp_read_system* helpers already check for NULL sts. > Hah. Yep, I knew that... and of course I forgot about it. > >> +static int ice_ptp_adjfine(struct ptp_clock_info *info, long scaled_ppm) >> +{ >> + struct ice_pf *pf = ptp_info_to_pf(info); >> + u64 freq, divisor = 1000000ULL; >> + struct ice_hw *hw = &pf->hw; >> + s64 incval, diff; >> + int neg_adj = 0; >> + int err; >> + >> + incval = ICE_PTP_NOMINAL_INCVAL_E810; >> + >> + if (scaled_ppm < 0) { >> + neg_adj = 1; >> + scaled_ppm = -scaled_ppm; >> + } >> + >> + while ((u64)scaled_ppm > div_u64(U64_MAX, incval)) { >> + /* handle overflow by scaling down the scaled_ppm and >> + * the divisor, losing some precision >> + */ >> + scaled_ppm >>= 2; >> + divisor >>= 2; >> + } > > I have a question regarding ppm overflows. > > We have the max_adj field in struct ptp_clock_info which is checked > against ppb, but ppb is a signed 32 bit and scaled_ppm is a long, > meaning values larger than S32_MAX << 16 / 1000 will overflow > the ppb calculation, and therefore the check. > Hmmm.. I thought ppb was a s64, not an s32. In general, I believe max_adj is usually capped at 1 billion anyways, since it doesn't make sense to slow a clock by more than 1billioln ppb, and increasing it more than that isn't really useful either. > Are we okay with that? Is my math off? Did I miss some part > of the kernel which filters crazy high scaled_ppm/freq? > > Since dialed_freq is updated regardless of return value of .adjfine > the driver has no clear way to reject bad scaled_ppm> I'm not sure. +Richard? >> + freq = (incval * (u64)scaled_ppm) >> 16; >> + diff = div_u64(freq, divisor);
On Mon, 14 Jun 2021 09:43:17 -0700 Jacob Keller wrote: > >> +static int ice_ptp_adjfine(struct ptp_clock_info *info, long scaled_ppm) > >> +{ > >> + struct ice_pf *pf = ptp_info_to_pf(info); > >> + u64 freq, divisor = 1000000ULL; > >> + struct ice_hw *hw = &pf->hw; > >> + s64 incval, diff; > >> + int neg_adj = 0; > >> + int err; > >> + > >> + incval = ICE_PTP_NOMINAL_INCVAL_E810; > >> + > >> + if (scaled_ppm < 0) { > >> + neg_adj = 1; > >> + scaled_ppm = -scaled_ppm; > >> + } > >> + > >> + while ((u64)scaled_ppm > div_u64(U64_MAX, incval)) { > >> + /* handle overflow by scaling down the scaled_ppm and > >> + * the divisor, losing some precision > >> + */ > >> + scaled_ppm >>= 2; > >> + divisor >>= 2; > >> + } > > > > I have a question regarding ppm overflows. > > > > We have the max_adj field in struct ptp_clock_info which is checked > > against ppb, but ppb is a signed 32 bit and scaled_ppm is a long, > > meaning values larger than S32_MAX << 16 / 1000 will overflow > > the ppb calculation, and therefore the check. > > Hmmm.. I thought ppb was a s64, not an s32. > > In general, I believe max_adj is usually capped at 1 billion anyways, > since it doesn't make sense to slow a clock by more than 1billioln ppb, > and increasing it more than that isn't really useful either. Do you mean it's capped somewhere in the code to 1B? I'm no time expert but this is not probability where 1 is a magic value, adjusting clock by 1 - 1ppb vs 1 + 1ppb makes little difference, no? Both mean something is super fishy with the nominal or expected frequency, but the hardware can do that and more. Flipping the question, if adjusting by large ppb values is not correct, why not cap the adjustment at the value which would prevent the u64 overflow? I don't really have a preferences here, I'm mostly disturbed by the overflow in the ppb vs max_adj check. > > Are we okay with that? Is my math off? Did I miss some part > > of the kernel which filters crazy high scaled_ppm/freq? > > > > Since dialed_freq is updated regardless of return value of .adjfine > > the driver has no clear way to reject bad scaled_ppm> > > I'm not sure. +Richard?
On Mon, Jun 14, 2021 at 09:43:17AM -0700, Jacob Keller wrote: > > Since dialed_freq is updated regardless of return value of .adjfine > > the driver has no clear way to reject bad scaled_ppm> > > I'm not sure. +Richard? The driver advertises "max_adj". The PHC layer checks user space inputs: ptp_clock.c line 140: } else if (tx->modes & ADJ_FREQUENCY) { s32 ppb = scaled_ppm_to_ppb(tx->freq); if (ppb > ops->max_adj || ppb < -ops->max_adj) return -ERANGE; So, if the max_adj is correct for the driver/HW, then all is well. Thanks, Richard
On Mon, 14 Jun 2021 11:12:20 -0700 Richard Cochran wrote: > On Mon, Jun 14, 2021 at 09:43:17AM -0700, Jacob Keller wrote: > > > Since dialed_freq is updated regardless of return value of .adjfine > > > the driver has no clear way to reject bad scaled_ppm> > > > > I'm not sure. +Richard? > > The driver advertises "max_adj". The PHC layer checks user space inputs: > > ptp_clock.c line 140: > } else if (tx->modes & ADJ_FREQUENCY) { > s32 ppb = scaled_ppm_to_ppb(tx->freq); > if (ppb > ops->max_adj || ppb < -ops->max_adj) > return -ERANGE; > > So, if the max_adj is correct for the driver/HW, then all is well. tx->freq is a long, and the conversion to ppb can overflow the s32 type. E.g. 281474976645 will become -2 AFAICT. I hacked up phc_ctl to not do range checking and kernel happily accepted that value. Shall we do this? --->8---- diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index 03a246e60fd9..ed32fc98d9d8 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -63,7 +63,7 @@ static void enqueue_external_timestamp(struct timestamp_event_queue *queue, spin_unlock_irqrestore(&queue->lock, flags); } -s32 scaled_ppm_to_ppb(long ppm) +s64 scaled_ppm_to_ppb(long ppm) { /* * The 'freq' field in the 'struct timex' is in parts per @@ -80,7 +80,7 @@ s32 scaled_ppm_to_ppb(long ppm) s64 ppb = 1 + ppm; ppb *= 125; ppb >>= 13; - return (s32) ppb; + return ppb; } EXPORT_SYMBOL(scaled_ppm_to_ppb); @@ -138,7 +138,7 @@ static int ptp_clock_adjtime(struct posix_clock *pc, struct __kernel_timex *tx) delta = ktime_to_ns(kt); err = ops->adjtime(ops, delta); } else if (tx->modes & ADJ_FREQUENCY) { - s32 ppb = scaled_ppm_to_ppb(tx->freq); + s64 ppb = scaled_ppm_to_ppb(tx->freq); if (ppb > ops->max_adj || ppb < -ops->max_adj) return -ERANGE; if (ops->adjfine)
> -----Original Message----- > From: Jakub Kicinski <kuba@kernel.org> > Sent: Monday, June 14, 2021 11:51 AM > To: Richard Cochran <richardcochran@gmail.com> > Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Nguyen, Anthony L > <anthony.l.nguyen@intel.com>; davem@davemloft.net; > netdev@vger.kernel.org; sassmann@redhat.com; Brelinski, TonyX > <tonyx.brelinski@intel.com> > Subject: Re: [PATCH net-next 5/8] ice: register 1588 PTP clock device object for > E810 devices > > On Mon, 14 Jun 2021 11:12:20 -0700 Richard Cochran wrote: > > On Mon, Jun 14, 2021 at 09:43:17AM -0700, Jacob Keller wrote: > > > > Since dialed_freq is updated regardless of return value of .adjfine > > > > the driver has no clear way to reject bad scaled_ppm> > > > > > > I'm not sure. +Richard? > > > > The driver advertises "max_adj". The PHC layer checks user space inputs: > > > > ptp_clock.c line 140: > > } else if (tx->modes & ADJ_FREQUENCY) { > > s32 ppb = scaled_ppm_to_ppb(tx->freq); > > if (ppb > ops->max_adj || ppb < -ops->max_adj) > > return -ERANGE; > > > > So, if the max_adj is correct for the driver/HW, then all is well. > > tx->freq is a long, and the conversion to ppb can overflow the s32 type. > E.g. 281474976645 will become -2 AFAICT. I hacked up phc_ctl to not do > range checking and kernel happily accepted that value. Shall we do this? > > --->8---- > > diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c > index 03a246e60fd9..ed32fc98d9d8 100644 > --- a/drivers/ptp/ptp_clock.c > +++ b/drivers/ptp/ptp_clock.c > @@ -63,7 +63,7 @@ static void enqueue_external_timestamp(struct > timestamp_event_queue *queue, > spin_unlock_irqrestore(&queue->lock, flags); > } > > -s32 scaled_ppm_to_ppb(long ppm) > +s64 scaled_ppm_to_ppb(long ppm) > { > /* > * The 'freq' field in the 'struct timex' is in parts per > @@ -80,7 +80,7 @@ s32 scaled_ppm_to_ppb(long ppm) > s64 ppb = 1 + ppm; > ppb *= 125; > ppb >>= 13; > - return (s32) ppb; > + return ppb; > } > EXPORT_SYMBOL(scaled_ppm_to_ppb); > > @@ -138,7 +138,7 @@ static int ptp_clock_adjtime(struct posix_clock *pc, struct > __kernel_timex *tx) > delta = ktime_to_ns(kt); > err = ops->adjtime(ops, delta); > } else if (tx->modes & ADJ_FREQUENCY) { > - s32 ppb = scaled_ppm_to_ppb(tx->freq); > + s64 ppb = scaled_ppm_to_ppb(tx->freq); > if (ppb > ops->max_adj || ppb < -ops->max_adj) > return -ERANGE; > if (ops->adjfine) This looks correct to me.
> -----Original Message----- > From: Jakub Kicinski <kuba@kernel.org> > Sent: Monday, June 14, 2021 11:09 AM > To: Keller, Jacob E <jacob.e.keller@intel.com> > Cc: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Richard Cochran > <richardcochran@gmail.com>; davem@davemloft.net; netdev@vger.kernel.org; > sassmann@redhat.com; Brelinski, TonyX <tonyx.brelinski@intel.com> > Subject: Re: [PATCH net-next 5/8] ice: register 1588 PTP clock device object for > E810 devices > > On Mon, 14 Jun 2021 09:43:17 -0700 Jacob Keller wrote: > > >> +static int ice_ptp_adjfine(struct ptp_clock_info *info, long scaled_ppm) > > >> +{ > > >> + struct ice_pf *pf = ptp_info_to_pf(info); > > >> + u64 freq, divisor = 1000000ULL; > > >> + struct ice_hw *hw = &pf->hw; > > >> + s64 incval, diff; > > >> + int neg_adj = 0; > > >> + int err; > > >> + > > >> + incval = ICE_PTP_NOMINAL_INCVAL_E810; > > >> + > > >> + if (scaled_ppm < 0) { > > >> + neg_adj = 1; > > >> + scaled_ppm = -scaled_ppm; > > >> + } > > >> + > > >> + while ((u64)scaled_ppm > div_u64(U64_MAX, incval)) { > > >> + /* handle overflow by scaling down the scaled_ppm and > > >> + * the divisor, losing some precision > > >> + */ > > >> + scaled_ppm >>= 2; > > >> + divisor >>= 2; > > >> + } > > > > > > I have a question regarding ppm overflows. > > > > > > We have the max_adj field in struct ptp_clock_info which is checked > > > against ppb, but ppb is a signed 32 bit and scaled_ppm is a long, > > > meaning values larger than S32_MAX << 16 / 1000 will overflow > > > the ppb calculation, and therefore the check. > > > > Hmmm.. I thought ppb was a s64, not an s32. > > > > In general, I believe max_adj is usually capped at 1 billion anyways, > > since it doesn't make sense to slow a clock by more than 1billioln ppb, > > and increasing it more than that isn't really useful either. > > Do you mean it's capped somewhere in the code to 1B? > > I'm no time expert but this is not probability where 1 is a magic > value, adjusting clock by 1 - 1ppb vs 1 + 1ppb makes little difference, > no? Both mean something is super fishy with the nominal or expected > frequency, but the hardware can do that and more. > > Flipping the question, if adjusting by large ppb values is not correct, > why not cap the adjustment at the value which would prevent the u64 > overflow? Large ppb values are sometimes used when you want to slew a clock to bring it in sync when its a few milliseconds to seconds off, without performing a time jump (so that you maintain monotonic increasing time). That being said, we are supposed to be checking max_adj, except that you're right the conversion to ppb could overflow, and there's no check prior to the conversion from scaled_ppm to ppb. > > I don't really have a preferences here, I'm mostly disturbed by > the overflow in the ppb vs max_adj check. > > > > Are we okay with that? Is my math off? Did I miss some part > > > of the kernel which filters crazy high scaled_ppm/freq? > > > > > > Since dialed_freq is updated regardless of return value of .adjfine > > > the driver has no clear way to reject bad scaled_ppm> > > > > I'm not sure. +Richard?
On Mon, 14 Jun 2021 19:50:23 +0000 Keller, Jacob E wrote: > > > Hmmm.. I thought ppb was a s64, not an s32. > > > > > > In general, I believe max_adj is usually capped at 1 billion anyways, > > > since it doesn't make sense to slow a clock by more than 1billioln ppb, > > > and increasing it more than that isn't really useful either. > > > > Do you mean it's capped somewhere in the code to 1B? > > > > I'm no time expert but this is not probability where 1 is a magic > > value, adjusting clock by 1 - 1ppb vs 1 + 1ppb makes little difference, > > no? Both mean something is super fishy with the nominal or expected > > frequency, but the hardware can do that and more. > > > > Flipping the question, if adjusting by large ppb values is not correct, > > why not cap the adjustment at the value which would prevent the u64 > > overflow? > > Large ppb values are sometimes used when you want to slew a clock to > bring it in sync when its a few milliseconds to seconds off, without > performing a time jump (so that you maintain monotonic increasing > time). Ah, you're right, ptp4l will explicitly cap the freq adjustments based on max_adj from sysfs, so setting max_adj too low could impact the convergence time in strange scenarios. > That being said, we are supposed to be checking max_adj, except that > you're right the conversion to ppb could overflow, and there's no > check prior to the conversion from scaled_ppm to ppb.
On 6/14/2021 1:48 PM, Jakub Kicinski wrote: > On Mon, 14 Jun 2021 19:50:23 +0000 Keller, Jacob E wrote: >>>> Hmmm.. I thought ppb was a s64, not an s32. >>>> >>>> In general, I believe max_adj is usually capped at 1 billion anyways, >>>> since it doesn't make sense to slow a clock by more than 1billioln ppb, >>>> and increasing it more than that isn't really useful either. >>> >>> Do you mean it's capped somewhere in the code to 1B? >>> >>> I'm no time expert but this is not probability where 1 is a magic >>> value, adjusting clock by 1 - 1ppb vs 1 + 1ppb makes little difference, >>> no? Both mean something is super fishy with the nominal or expected >>> frequency, but the hardware can do that and more. >>> >>> Flipping the question, if adjusting by large ppb values is not correct, >>> why not cap the adjustment at the value which would prevent the u64 >>> overflow? >> >> Large ppb values are sometimes used when you want to slew a clock to >> bring it in sync when its a few milliseconds to seconds off, without >> performing a time jump (so that you maintain monotonic increasing >> time). > > Ah, you're right, ptp4l will explicitly cap the freq adjustments > based on max_adj from sysfs, so setting max_adj too low could impact > the convergence time in strange scenarios. > Your patch to fix it so that the conversion from scaled_ppm to ppb can't overflow is the correct approach, here. The scaled_ppm function didn't account for the fact that the provided adjustment could overflow the s32. Increasing that to s64 ensures it won't overflow and prevents invalid bogus frequencies from passing that check.
On Mon, 14 Jun 2021 13:51:57 -0700 Jacob Keller wrote: > > Ah, you're right, ptp4l will explicitly cap the freq adjustments > > based on max_adj from sysfs, so setting max_adj too low could impact > > the convergence time in strange scenarios. > > Your patch to fix it so that the conversion from scaled_ppm to ppb can't > overflow is the correct approach, here. The scaled_ppm function didn't > account for the fact that the provided adjustment could overflow the s32. > > Increasing that to s64 ensures it won't overflow and prevents invalid > bogus frequencies from passing that check. On second though I went with long for ppb IOW the same as scaled_ppm, presumably the problem does not exist on 32bit, so no point using 64b everywhere.
On Mon, Jun 14, 2021 at 11:50:43AM -0700, Jakub Kicinski wrote: > tx->freq is a long, and the conversion to ppb can overflow the s32 type. > E.g. 281474976645 will become -2 AFAICT. I hacked up phc_ctl to not do > range checking and kernel happily accepted that value. Shall we do this? Yes, you are right. The range check has a bug, and your fix is good. Thanks, Richard