Message ID | 20200910161126.30948-1-oded.gabbay@gmail.com |
---|---|
Headers | show |
Series | Adding GAUDI NIC code to habanalabs driver | expand |
On Thu, Sep 10, 2020 at 11:01 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 10 Sep 2020 19:11:11 +0300 Oded Gabbay wrote: > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.c > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.h > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_dcbnl.c > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_debugfs.c > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_ethtool.c > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_phy.c > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_masks.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_masks.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxb_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_masks.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_stat_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_tmr_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_masks.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_masks.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm0_regs.h > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm1_regs.h > > create mode 100644 drivers/misc/habanalabs/include/hw_ip/nic/nic_general.h > > The relevant code needs to live under drivers/net/(ethernet/). > For one thing our automation won't trigger for drivers in random > (/misc) part of the tree. Can you please elaborate on how to do this with a single driver that is already in misc ? As I mentioned in the cover letter, we are not developing a stand-alone NIC. We have a deep-learning accelerator with a NIC interface. Therefore, we don't have a separate PCI physical function for the NIC and I can't have a second driver registering to it. We did this design based on existing examples in the kernel (registering to netdev), such as (just a few examples): 1. sgi-xp driver in drivers/misc/sgi-xp (see file xpnet.c) 2. bnx2fc in drivers/scsi/bnx2fc Thanks, Oded
On Thu, Sep 10, 2020 at 11:03 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 10 Sep 2020 19:11:16 +0300 Oded Gabbay wrote: > > +module_param(nic_rx_poll, int, 0444); > > +MODULE_PARM_DESC(nic_rx_poll, > > + "Enable NIC Rx polling mode (0 = no, 1 = yes, default no)"); > > If your chip does not support IRQ coalescing you can configure polling > and the timeout via ethtool -C, rather than a module parameter. Thanks for the pointer, we will take a look at that. Oded
> +static int gaudi_nic_get_link_ksettings(struct net_device *netdev, > + struct ethtool_link_ksettings *cmd) > +{ > + struct gaudi_nic_device **ptr = netdev_priv(netdev); > + struct gaudi_nic_device *gaudi_nic = *ptr; > + struct hl_device *hdev = gaudi_nic->hdev; > + u32 port = gaudi_nic->port, speed = gaudi_nic->speed; Please go through the code and fixup Reverse Christmas tree. > + > + cmd->base.speed = speed; > + cmd->base.duplex = DUPLEX_FULL; > + > + ethtool_link_ksettings_zero_link_mode(cmd, supported); > + ethtool_link_ksettings_zero_link_mode(cmd, advertising); > + > + ethtool_add_mode(cmd, supported, 100000baseCR4_Full); > + ethtool_add_mode(cmd, supported, 100000baseSR4_Full); > + ethtool_add_mode(cmd, supported, 100000baseKR4_Full); > + ethtool_add_mode(cmd, supported, 100000baseLR4_ER4_Full); > + > + ethtool_add_mode(cmd, supported, 50000baseSR2_Full); > + ethtool_add_mode(cmd, supported, 50000baseCR2_Full); > + ethtool_add_mode(cmd, supported, 50000baseKR2_Full); > + > + if (speed == SPEED_100000) { > + ethtool_add_mode(cmd, advertising, 100000baseCR4_Full); > + ethtool_add_mode(cmd, advertising, 100000baseSR4_Full); > + ethtool_add_mode(cmd, advertising, 100000baseKR4_Full); > + ethtool_add_mode(cmd, advertising, 100000baseLR4_ER4_Full); > + > + cmd->base.port = PORT_FIBRE; > + > + ethtool_add_mode(cmd, supported, FIBRE); > + ethtool_add_mode(cmd, advertising, FIBRE); > + > + ethtool_add_mode(cmd, supported, Backplane); > + ethtool_add_mode(cmd, advertising, Backplane); > + } else if (speed == SPEED_50000) { > + ethtool_add_mode(cmd, advertising, 50000baseSR2_Full); > + ethtool_add_mode(cmd, advertising, 50000baseCR2_Full); > + ethtool_add_mode(cmd, advertising, 50000baseKR2_Full); > + } else { > + dev_err(hdev->dev, "unknown speed %d, port %d\n", speed, port); > + return -EFAULT; > + } > + > + ethtool_add_mode(cmd, supported, Autoneg); > + > + if (gaudi_nic->auto_neg_enable) { > + ethtool_add_mode(cmd, advertising, Autoneg); > + cmd->base.autoneg = AUTONEG_ENABLE; > + if (gaudi_nic->auto_neg_resolved) > + ethtool_add_mode(cmd, lp_advertising, Autoneg); > + } else { > + cmd->base.autoneg = AUTONEG_DISABLE; > + } > + > + ethtool_add_mode(cmd, supported, Pause); > + > + if (gaudi_nic->pfc_enable) > + ethtool_add_mode(cmd, advertising, Pause); > + > + return 0; > +} > + > +static int gaudi_nic_set_link_ksettings(struct net_device *netdev, > + const struct ethtool_link_ksettings *cmd) > +{ > + struct gaudi_nic_device **ptr = netdev_priv(netdev); > + struct gaudi_nic_device *gaudi_nic = *ptr; > + struct hl_device *hdev = gaudi_nic->hdev; > + u32 port = gaudi_nic->port; > + int rc = 0, speed = cmd->base.speed; > + bool auto_neg = cmd->base.autoneg == AUTONEG_ENABLE; It appears you only support speed and auto_neg. You should check that all other things which could be configured are empty, e.g. none of the bits are set in cmd->link_modes.advertising. If you are requested to configure something which is not supported, you need to return -EOPNOTSUPP. Andrew
On Thu, Sep 10, 2020 at 11:19 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > +static int gaudi_nic_get_link_ksettings(struct net_device *netdev, > > + struct ethtool_link_ksettings *cmd) > > +{ > > + struct gaudi_nic_device **ptr = netdev_priv(netdev); > > + struct gaudi_nic_device *gaudi_nic = *ptr; > > + struct hl_device *hdev = gaudi_nic->hdev; > > + u32 port = gaudi_nic->port, speed = gaudi_nic->speed; > > Please go through the code and fixup Reverse Christmas tree. Of course, we will fix this. > > > + > > + cmd->base.speed = speed; > > + cmd->base.duplex = DUPLEX_FULL; > > + > > + ethtool_link_ksettings_zero_link_mode(cmd, supported); > > + ethtool_link_ksettings_zero_link_mode(cmd, advertising); > > + > > + ethtool_add_mode(cmd, supported, 100000baseCR4_Full); > > + ethtool_add_mode(cmd, supported, 100000baseSR4_Full); > > + ethtool_add_mode(cmd, supported, 100000baseKR4_Full); > > + ethtool_add_mode(cmd, supported, 100000baseLR4_ER4_Full); > > + > > + ethtool_add_mode(cmd, supported, 50000baseSR2_Full); > > + ethtool_add_mode(cmd, supported, 50000baseCR2_Full); > > + ethtool_add_mode(cmd, supported, 50000baseKR2_Full); > > + > > + if (speed == SPEED_100000) { > > + ethtool_add_mode(cmd, advertising, 100000baseCR4_Full); > > + ethtool_add_mode(cmd, advertising, 100000baseSR4_Full); > > + ethtool_add_mode(cmd, advertising, 100000baseKR4_Full); > > + ethtool_add_mode(cmd, advertising, 100000baseLR4_ER4_Full); > > + > > + cmd->base.port = PORT_FIBRE; > > + > > + ethtool_add_mode(cmd, supported, FIBRE); > > + ethtool_add_mode(cmd, advertising, FIBRE); > > + > > + ethtool_add_mode(cmd, supported, Backplane); > > + ethtool_add_mode(cmd, advertising, Backplane); > > + } else if (speed == SPEED_50000) { > > + ethtool_add_mode(cmd, advertising, 50000baseSR2_Full); > > + ethtool_add_mode(cmd, advertising, 50000baseCR2_Full); > > + ethtool_add_mode(cmd, advertising, 50000baseKR2_Full); > > + } else { > > + dev_err(hdev->dev, "unknown speed %d, port %d\n", speed, port); > > + return -EFAULT; > > + } > > + > > + ethtool_add_mode(cmd, supported, Autoneg); > > + > > + if (gaudi_nic->auto_neg_enable) { > > + ethtool_add_mode(cmd, advertising, Autoneg); > > + cmd->base.autoneg = AUTONEG_ENABLE; > > + if (gaudi_nic->auto_neg_resolved) > > + ethtool_add_mode(cmd, lp_advertising, Autoneg); > > + } else { > > + cmd->base.autoneg = AUTONEG_DISABLE; > > + } > > + > > + ethtool_add_mode(cmd, supported, Pause); > > + > > + if (gaudi_nic->pfc_enable) > > + ethtool_add_mode(cmd, advertising, Pause); > > + > > + return 0; > > +} > > + > > +static int gaudi_nic_set_link_ksettings(struct net_device *netdev, > > + const struct ethtool_link_ksettings *cmd) > > +{ > > + struct gaudi_nic_device **ptr = netdev_priv(netdev); > > + struct gaudi_nic_device *gaudi_nic = *ptr; > > + struct hl_device *hdev = gaudi_nic->hdev; > > + u32 port = gaudi_nic->port; > > + int rc = 0, speed = cmd->base.speed; > > + bool auto_neg = cmd->base.autoneg == AUTONEG_ENABLE; > > It appears you only support speed and auto_neg. You should check that > all other things which could be configured are empty, e.g. none of the > bits are set in cmd->link_modes.advertising. If you are requested to > configure something which is not supported, you need to return > -EOPNOTSUPP. > > Andrew Thanks Andrew, We will do that and send an updated version. Oded
> Can you please elaborate on how to do this with a single driver that > is already in misc ? > As I mentioned in the cover letter, we are not developing a > stand-alone NIC. We have a deep-learning accelerator with a NIC > interface. This sounds like an MFD. Andrew
On Thu, 10 Sep 2020 23:16:22 +0300 Oded Gabbay wrote: > On Thu, Sep 10, 2020 at 11:01 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 10 Sep 2020 19:11:11 +0300 Oded Gabbay wrote: > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.c > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.h > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_dcbnl.c > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_debugfs.c > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_ethtool.c > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_phy.c > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_masks.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_masks.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxb_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_masks.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_stat_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_tmr_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_masks.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_masks.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm0_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm1_regs.h > > > create mode 100644 drivers/misc/habanalabs/include/hw_ip/nic/nic_general.h > > > > The relevant code needs to live under drivers/net/(ethernet/). > > For one thing our automation won't trigger for drivers in random > > (/misc) part of the tree. > > Can you please elaborate on how to do this with a single driver that > is already in misc ? > As I mentioned in the cover letter, we are not developing a > stand-alone NIC. We have a deep-learning accelerator with a NIC > interface. > Therefore, we don't have a separate PCI physical function for the NIC > and I can't have a second driver registering to it. Is it not possible to move the files and still build them into a single module? > We did this design based on existing examples in the kernel > (registering to netdev), such as (just a few examples): > 1. sgi-xp driver in drivers/misc/sgi-xp (see file xpnet.c) > 2. bnx2fc in drivers/scsi/bnx2fc > > Thanks, > Oded
On Thu, Sep 10, 2020 at 11:25 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > Can you please elaborate on how to do this with a single driver that > > is already in misc ? > > As I mentioned in the cover letter, we are not developing a > > stand-alone NIC. We have a deep-learning accelerator with a NIC > > interface. > > This sounds like an MFD. > > Andrew Yes and no. There is only one functionality - training of deep learning (Accelerating compute operations) :) The rdma is just our method of scaling-out - our method of intra-connection between GAUDI devices (similar to NVlink or AMD crossfire). So the H/W exposes a single physical function at the PCI level. And thus Linux can call a single driver for it during the PCI probe. I hope that in future generations we will improve that, but it is what it is for GAUDI. I don't see how to do it otherwise currently but if you have ideas please share. Oded
On Thu, Sep 10, 2020 at 11:28 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 10 Sep 2020 23:16:22 +0300 Oded Gabbay wrote: > > On Thu, Sep 10, 2020 at 11:01 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > On Thu, 10 Sep 2020 19:11:11 +0300 Oded Gabbay wrote: > > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.c > > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.h > > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_dcbnl.c > > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_debugfs.c > > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_ethtool.c > > > > create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_phy.c > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_masks.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_masks.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxb_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_masks.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_stat_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_tmr_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_masks.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_masks.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm0_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm1_regs.h > > > > create mode 100644 drivers/misc/habanalabs/include/hw_ip/nic/nic_general.h > > > > > > The relevant code needs to live under drivers/net/(ethernet/). > > > For one thing our automation won't trigger for drivers in random > > > (/misc) part of the tree. > > > > Can you please elaborate on how to do this with a single driver that > > is already in misc ? > > As I mentioned in the cover letter, we are not developing a > > stand-alone NIC. We have a deep-learning accelerator with a NIC > > interface. > > Therefore, we don't have a separate PCI physical function for the NIC > > and I can't have a second driver registering to it. > > Is it not possible to move the files and still build them into a single > module? hmm... I actually didn't try that as I thought it will be very strange and I'm not familiar with other drivers that build as a single ko but have files spread out in different subsystems. I don't feel it is a better option than what we did here. Will I need to split pull requests to different subsystem maintainers ? For the same driver ? Sounds to me this is not going to fly. Thanks, Oded > > > We did this design based on existing examples in the kernel > > (registering to netdev), such as (just a few examples): > > 1. sgi-xp driver in drivers/misc/sgi-xp (see file xpnet.c) > > 2. bnx2fc in drivers/scsi/bnx2fc > > > > Thanks, > > Oded >
On Thu, Sep 10, 2020 at 11:38 PM Andrew Lunn <andrew@lunn.ch> wrote: > > On Thu, Sep 10, 2020 at 11:30:33PM +0300, Oded Gabbay wrote: > > On Thu, Sep 10, 2020 at 11:25 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > > > > > Can you please elaborate on how to do this with a single driver that > > > > is already in misc ? > > > > As I mentioned in the cover letter, we are not developing a > > > > stand-alone NIC. We have a deep-learning accelerator with a NIC > > > > interface. > > > > > > This sounds like an MFD. > > > > > > Andrew > > > > Yes and no. There is only one functionality - training of deep > > learning (Accelerating compute operations) :) > > The rdma is just our method of scaling-out - our method of > > intra-connection between GAUDI devices (similar to NVlink or AMD > > crossfire). > > So the H/W exposes a single physical function at the PCI level. And > > thus Linux can call a single driver for it during the PCI probe. > > Yes, it probes the MFD driver. The MFD driver then creates platform > drivers for the sub functions. i.e. it would create an Ethernet > platform driver. That then gets probed in the usual way. The child > device can get access to the parent device, if it needs to share > things, e.g. a device on a bus. This is typically I2C or SPI, but > there is no reason it cannot be a PCI device. > > Go look in drivers/mfd. > > Andrew I'm slightly familiar with drivers/mfd and as you mentioned, those are for "simple" devices, which use a bus with different functionality on them, like I2C with many devices (sensors for various things, etc). I've never seen anyone doing a PCI device there and frankly, I don't see the benefit of trying to migrate our complex PCI driver to that subsystem, if it will even work. And I would like to reiterate that our NIC ports are highly integrated with our compute engines. They "talk" to each other via sync objects inside the SOC, and all of them are used as part of the training of the deep learning network. Another example why this is not MFD - when a compute engine gets stuck, all the NIC ports are going through reset. So it's not the same as multiple devices that use the same bus or H/W. It's a single device with some engines that work in harmony. The bottom line is that we have single functionality and the scale-out is done via RDMA that is integrated on the device. We could have chosen other ways to scale-out (like some proprietary bus) and then would that count as another functionality ? I think not. So I'm not going to drivers/mfd with our driver. I wish that I had multiple PCI PF so I could do a proper Ethernet driver but I can't for this H/W. And I think that physically splitting the files into two subsystems will be very hard to maintain and definitely I will want to hear Greg's opinion on that. Thanks, Oded
On Fri, Sep 11, 2020 at 12:05 AM Florian Fainelli <f.fainelli@gmail.com> wrote: > > > > On 9/10/2020 1:32 PM, Oded Gabbay wrote: > > On Thu, Sep 10, 2020 at 11:28 PM Jakub Kicinski <kuba@kernel.org> wrote: > >> > >> On Thu, 10 Sep 2020 23:16:22 +0300 Oded Gabbay wrote: > >>> On Thu, Sep 10, 2020 at 11:01 PM Jakub Kicinski <kuba@kernel.org> wrote: > >>>> On Thu, 10 Sep 2020 19:11:11 +0300 Oded Gabbay wrote: > >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.c > >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.h > >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_dcbnl.c > >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_debugfs.c > >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_ethtool.c > >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_phy.c > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_masks.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_masks.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxb_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_masks.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_stat_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_tmr_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_masks.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_masks.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm0_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm1_regs.h > >>>>> create mode 100644 drivers/misc/habanalabs/include/hw_ip/nic/nic_general.h > >>>> > >>>> The relevant code needs to live under drivers/net/(ethernet/). > >>>> For one thing our automation won't trigger for drivers in random > >>>> (/misc) part of the tree. > >>> > >>> Can you please elaborate on how to do this with a single driver that > >>> is already in misc ? > >>> As I mentioned in the cover letter, we are not developing a > >>> stand-alone NIC. We have a deep-learning accelerator with a NIC > >>> interface. > >>> Therefore, we don't have a separate PCI physical function for the NIC > >>> and I can't have a second driver registering to it. > >> > >> Is it not possible to move the files and still build them into a single > >> module? > > hmm... > > I actually didn't try that as I thought it will be very strange and > > I'm not familiar with other drivers that build as a single ko but have > > files spread out in different subsystems. > > I don't feel it is a better option than what we did here. > > > > Will I need to split pull requests to different subsystem maintainers > > ? For the same driver ? > > Sounds to me this is not going to fly. > > Not necessarily, you can post your patches to all relevant lists and > seek maintainer review/acked-by tags from the relevant maintainers. This > is not unheard of with mlx5 for instance. Yeah, I see what you are saying, the problem is that sometimes, because everything is tightly integrated in our SOC, the patches contain code from common code (common to ALL our ASICs, even those who don't have NIC at all), GAUDI specific code which is not NIC related and the NIC code itself. But I guess that as a last resort if this is a *must* I can do that. Though I would like to hear Greg's opinion on this as he is my current maintainer. Personally I do want to send relevant patches to netdev because I want to get your expert reviews on them, but still keep the code in a single location. > > Have you considered using notifiers to get your NIC driver registered > while the NIC code lives in a different module? Yes, and I prefered to keep it simple. I didn't want to start sending notifications to the NIC driver every time, for example, I needed to reset the SOC because a compute engine got stuck. Or vice-versa - when some error happened in the NIC to start sending notifications to the common driver. In addition, from my AMD days, we had a very tough time managing two drivers that "talk" to each other and manage the same H/W. I'm talking about amdgpu for graphics and amdkfd for compute (which I was the maintainer). AMD is working in the past years to unite those two drivers to get out of that mess. That's why I didn't want to go down that road. Thanks, Oded > -- > Florian
On 9/10/2020 2:15 PM, Oded Gabbay wrote: > On Fri, Sep 11, 2020 at 12:05 AM Florian Fainelli <f.fainelli@gmail.com> wrote: >> >> >> >> On 9/10/2020 1:32 PM, Oded Gabbay wrote: >>> On Thu, Sep 10, 2020 at 11:28 PM Jakub Kicinski <kuba@kernel.org> wrote: >>>> >>>> On Thu, 10 Sep 2020 23:16:22 +0300 Oded Gabbay wrote: >>>>> On Thu, Sep 10, 2020 at 11:01 PM Jakub Kicinski <kuba@kernel.org> wrote: >>>>>> On Thu, 10 Sep 2020 19:11:11 +0300 Oded Gabbay wrote: >>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.c >>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.h >>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_dcbnl.c >>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_debugfs.c >>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_ethtool.c >>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_phy.c >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_masks.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_masks.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxb_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_masks.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_stat_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_tmr_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_masks.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_masks.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm0_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm1_regs.h >>>>>>> create mode 100644 drivers/misc/habanalabs/include/hw_ip/nic/nic_general.h >>>>>> >>>>>> The relevant code needs to live under drivers/net/(ethernet/). >>>>>> For one thing our automation won't trigger for drivers in random >>>>>> (/misc) part of the tree. >>>>> >>>>> Can you please elaborate on how to do this with a single driver that >>>>> is already in misc ? >>>>> As I mentioned in the cover letter, we are not developing a >>>>> stand-alone NIC. We have a deep-learning accelerator with a NIC >>>>> interface. >>>>> Therefore, we don't have a separate PCI physical function for the NIC >>>>> and I can't have a second driver registering to it. >>>> >>>> Is it not possible to move the files and still build them into a single >>>> module? >>> hmm... >>> I actually didn't try that as I thought it will be very strange and >>> I'm not familiar with other drivers that build as a single ko but have >>> files spread out in different subsystems. >>> I don't feel it is a better option than what we did here. >>> >>> Will I need to split pull requests to different subsystem maintainers >>> ? For the same driver ? >>> Sounds to me this is not going to fly. >> >> Not necessarily, you can post your patches to all relevant lists and >> seek maintainer review/acked-by tags from the relevant maintainers. This >> is not unheard of with mlx5 for instance. > Yeah, I see what you are saying, the problem is that sometimes, > because everything is tightly integrated in our SOC, the patches > contain code from common code (common to ALL our ASICs, even those who > don't have NIC at all), GAUDI specific code which is not NIC related > and the NIC code itself. > But I guess that as a last resort if this is a *must* I can do that. > Though I would like to hear Greg's opinion on this as he is my current > maintainer. > > Personally I do want to send relevant patches to netdev because I want > to get your expert reviews on them, but still keep the code in a > single location. We do have network drivers sprinkled across the kernel tree already, but I would agree that from a networking maintainer perspective this makes auditing code harder, you would naturally grep for net/ and drivers/net and easily miss arch/uml/ for instance. When you do treewide changes, having all your ducklings in the same pond is a lot easier. There is a possible "risk" with posting a patch series for the habanalabs driver to netdev that people will be wondering what this is about and completely miss it is about the networking bits. If there is a NIC driver under drivers/net then people will start to filter or pay attention based on the directory. > >> >> Have you considered using notifiers to get your NIC driver registered >> while the NIC code lives in a different module? > Yes, and I prefered to keep it simple. I didn't want to start sending > notifications to the NIC driver every time, for example, I needed to > reset the SOC because a compute engine got stuck. Or vice-versa - when > some error happened in the NIC to start sending notifications to the > common driver. > > In addition, from my AMD days, we had a very tough time managing two > drivers that "talk" to each other and manage the same H/W. I'm talking > about amdgpu for graphics and amdkfd for compute (which I was the > maintainer). AMD is working in the past years to unite those two > drivers to get out of that mess. That's why I didn't want to go down > that road. You are trading an indirect call for a direct call, and it does provide some nice interface, but it could be challenging to work with given the context in which the notifier is called can be problematic. You could still have direct module references then, and that would avoid the need for notifiers. You are the driver maintainer, so you definitively have a bigger say in the matter than most of us, drive by contributors.
On Thu, Sep 10, 2020 at 11:03 PM Jakub Kicinski <kuba@kernel.org> wrote: > On Thu, 10 Sep 2020 19:11:16 +0300 Oded Gabbay wrote: > > +module_param(nic_rx_poll, int, 0444); > MODULE_PARM_DESC(nic_rx_poll, > > + "Enable NIC Rx polling mode (0 = no, 1 = yes, default no)"); > > If your chip does not support IRQ coalescing you can configure polling and the > timeout via ethtool -C, rather than a module parameter. I couldn't find an example for that in other drivers and I didn't see anything regarding polling mode in the parameters description of this ethtool callback. Can you please specify some pointer for that? Or in other words, what parameter can we use to enable polling/setting the timeout? Thanks, Omer
On Mon, 14 Sep 2020 09:52:00 +0000 Omer Shpigelman wrote: > On Thu, Sep 10, 2020 at 11:03 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 10 Sep 2020 19:11:16 +0300 Oded Gabbay wrote: > > > +module_param(nic_rx_poll, int, 0444); > > MODULE_PARM_DESC(nic_rx_poll, > > > + "Enable NIC Rx polling mode (0 = no, 1 = yes, default no)"); > > > > If your chip does not support IRQ coalescing you can configure polling and the > > timeout via ethtool -C, rather than a module parameter. > > I couldn't find an example for that in other drivers and I didn't see > anything regarding polling mode in the parameters description of this > ethtool callback. > Can you please specify some pointer for that? Or in other words, what > parameter can we use to enable polling/setting the timeout? Look at stmmac, hip04_eth..