[0/3] can: c_can: cache frames to operate as a true FIFO

Message ID 20210509124309.30024-1-dariobin@libero.it

Message

Dario Binacchi May 9, 2021, 12:43 p.m. UTC
Performance tests of the c_can driver led to the patch that gives the
series its name. We have also added a patch for ethtool support and a
patch to remove a variable that is no longer used.


Dario Binacchi (3):
  can: c_can: remove the rxmasked unused variable
  can: c_can: add ethtool support
  can: c_can: cache frames to operate as a true FIFO

 drivers/net/can/c_can/Makefile                |  3 +
 drivers/net/can/c_can/c_can.h                 |  6 +-
 drivers/net/can/c_can/c_can_ethtool.c         | 46 +++++++++++++
 .../net/can/c_can/{c_can.c => c_can_main.c}   | 65 +++++++++++++++----
 4 files changed, 107 insertions(+), 13 deletions(-)
 create mode 100644 drivers/net/can/c_can/c_can_ethtool.c
 rename drivers/net/can/c_can/{c_can.c => c_can_main.c} (95%)

Comments

Marc Kleine-Budde May 10, 2021, 12:25 p.m. UTC | #1
On 09.05.2021 14:43:09, Dario Binacchi wrote:
> As reported by a comment in the c_can_start_xmit() this was not a FIFO.
> C/D_CAN controller sends out the buffers prioritized so that the lowest
> buffer number wins.
>
> What did c_can_start_xmit() do if it found tx_active = 0x80000000 ? It
> waited until the only frame of the FIFO was actually transmitted by the
> controller. Only one message in the FIFO but we had to wait for it to
> empty completely to ensure that the messages were transmitted in the
> order in which they were loaded.
>
> By storing the frames in the FIFO without requiring its transmission, we
> will be able to use the full size of the FIFO even in cases such as the
> one described above. The transmission interrupt will trigger their
> transmission only when all the messages previously loaded but stored in
> less priority positions of the buffers have been transmitted.
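
The ordering problem in the quoted text can be illustrated with a small model. This is only a sketch, not driver code: it assumes a 32-bit pending mask in which bit n stands for TX buffer n, and next_tx_buffer() is a hypothetical helper that mimics the "lowest buffer number wins" rule.

```c
#include <stdint.h>

/*
 * Hypothetical model, not taken from the c_can driver: bit n of `pending`
 * is set when TX buffer n holds a frame waiting to be sent.
 * Returns the buffer the controller would pick next (the lowest set bit),
 * or -1 if nothing is pending.
 */
static int next_tx_buffer(uint32_t pending)
{
    for (int n = 0; n < 32; n++) {
        if (pending & (UINT32_C(1) << n))
            return n;
    }
    return -1;
}
```

With only buffer 31 pending (mask 0x80000000) the model picks buffer 31. If a new frame is loaded into buffer 0 while buffer 31 is still pending (mask 0x80000001), the model picks buffer 0 first, so the newer frame overtakes the older one. Under this assumption the driver has to wait for the last buffer to drain before it can reuse buffer 0.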


The algorithm you implemented looks a bit too complicated to me. Let me
sketch the algorithm that's implemented by several other drivers.

- have a power of two number of TX objects
- add a number of objects to struct priv (tx_num)
  (or make it a define, if the number of tx objects is compile time fixed)
- add two "unsigned int" variables to your struct priv,
  one "tx_head", one "tx_tail"
- the hard_start_xmit() writes to priv->tx_head & (priv->tx_num - 1)
- increment tx_head
- stop the tx_queue if there is no space or if the object with the
  lowest prio has been written
- in TX complete IRQ, handle priv->tx_tail object
- increment tx_tail
- wake queue if there is space but don't wake if we wait for the lowest
  prio object to be TX completed.

Special care needs to be taken to implement that lock-less and race
free. I suggest looking at the mcp251xfd driver.
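
The steps listed above can be sketched as a small single-threaded model. It is a sketch under stated assumptions, not the mcp251xfd or c_can implementation: struct tx_ring, TX_NUM, the helper names and the queue_stopped flag (standing in for stopping and waking the network TX queue) are all hypothetical. It also assumes the lowest-priority object is the one with the highest index, and it deliberately ignores the locking and memory-ordering issues mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_NUM 8u   /* hypothetical number of TX objects, must be a power of two */

/* Hypothetical private state; the counters are free-running and only masked on use. */
struct tx_ring {
    unsigned int tx_head;   /* next object to be written by the xmit path */
    unsigned int tx_tail;   /* next object to be handled on TX complete */
    bool queue_stopped;     /* stand-in for a stopped network TX queue */
};

static unsigned int tx_ring_index(unsigned int counter)
{
    return counter & (TX_NUM - 1);
}

static unsigned int tx_ring_free(const struct tx_ring *r)
{
    return TX_NUM - (r->tx_head - r->tx_tail);
}

/* Transmit path: returns the object index that was written, or -1 if stopped. */
static int tx_ring_xmit(struct tx_ring *r)
{
    unsigned int idx;

    assert((TX_NUM & (TX_NUM - 1)) == 0);
    if (r->queue_stopped)
        return -1;

    idx = tx_ring_index(r->tx_head);
    r->tx_head++;

    /* Stop if there is no space or the lowest-priority object was written. */
    if (tx_ring_free(r) == 0 || idx == TX_NUM - 1)
        r->queue_stopped = true;

    return (int)idx;
}

/* TX-complete path: returns the completed object index, or -1 if none pending. */
static int tx_ring_complete(struct tx_ring *r)
{
    unsigned int idx;

    if (r->tx_head == r->tx_tail)
        return -1;

    idx = tx_ring_index(r->tx_tail);
    r->tx_tail++;

    /*
     * Wake if there is space, but not while the head has wrapped to object 0
     * and older, lower-priority objects are still waiting to complete.
     */
    if (tx_ring_free(r) > 0 &&
        (tx_ring_index(r->tx_head) != 0 || r->tx_head == r->tx_tail))
        r->queue_stopped = false;

    return (int)idx;
}
```

In this model, filling all TX_NUM objects stops the queue, and it is woken again only once the last object has completed, so frames leave in the order they were queued. A real driver would additionally need the barriers and stop/wake re-checks that make the two paths safe against each other.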

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
Marc Kleine-Budde May 10, 2021, 12:36 p.m. UTC | #2
On 10.05.2021 14:25:15, Marc Kleine-Budde wrote:
> On 09.05.2021 14:43:09, Dario Binacchi wrote:
> > As reported by a comment in the c_can_start_xmit() this was not a FIFO.
> > C/D_CAN controller sends out the buffers prioritized so that the lowest
> > buffer number wins.
> >
> > What did c_can_start_xmit() do if it found tx_active = 0x80000000 ? It
> > waited until the only frame of the FIFO was actually transmitted by the
> > controller. Only one message in the FIFO but we had to wait for it to
> > empty completely to ensure that the messages were transmitted in the
> > order in which they were loaded.
> >
> > By storing the frames in the FIFO without requiring its transmission, we
> > will be able to use the full size of the FIFO even in cases such as the
> > one described above. The transmission interrupt will trigger their
> > transmission only when all the messages previously loaded but stored in
> > less priority positions of the buffers have been transmitted.
>
> The algorithm you implemented looks a bit too complicated to me. Let me
> sketch the algorithm that's implemented by several other drivers.
>
> - have a power of two number of TX objects
> - add a number of objects to struct priv (tx_num)
>   (or make it a define, if the number of tx objects is compile time fixed)
> - add two "unsigned int" variables to your struct priv,
>   one "tx_head", one "tx_tail"
> - the hard_start_xmit() writes to priv->tx_head & (priv->tx_num - 1)
> - increment tx_head
> - stop the tx_queue if there is no space or if the object with the
>   lowest prio has been written
> - in TX complete IRQ, handle priv->tx_tail object
> - increment tx_tail
> - wake queue if there is space but don't wake if we wait for the lowest
>   prio object to be TX completed.
>
> Special care needs to be taken to implement that lock-less and race
> free. I suggest looking at the mcp251xfd driver.


After converting the driver to the implementation outlined above, it
should be more straightforward to add the caching you implemented.

regards,
Marc

Marc Kleine-Budde May 10, 2021, 1:29 p.m. UTC | #3
On 09.05.2021 14:43:07, Dario Binacchi wrote:
> Initialized by c_can_chip_config() it's never used.
>
> Signed-off-by: Dario Binacchi <dariobin@libero.it>


applied to linux-can-next/testing

thanks,
Marc

Dario Binacchi May 13, 2021, 11:23 a.m. UTC | #4
Hi Marc,

> On 10/05/2021 14:36 Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
> On 10.05.2021 14:25:15, Marc Kleine-Budde wrote:
> > On 09.05.2021 14:43:09, Dario Binacchi wrote:
> > > As reported by a comment in the c_can_start_xmit() this was not a FIFO.
> > > C/D_CAN controller sends out the buffers prioritized so that the lowest
> > > buffer number wins.
> > >
> > > What did c_can_start_xmit() do if it found tx_active = 0x80000000 ? It
> > > waited until the only frame of the FIFO was actually transmitted by the
> > > controller. Only one message in the FIFO but we had to wait for it to
> > > empty completely to ensure that the messages were transmitted in the
> > > order in which they were loaded.
> > >
> > > By storing the frames in the FIFO without requiring its transmission, we
> > > will be able to use the full size of the FIFO even in cases such as the
> > > one described above. The transmission interrupt will trigger their
> > > transmission only when all the messages previously loaded but stored in
> > > less priority positions of the buffers have been transmitted.
> >
> > The algorithm you implemented looks a bit too complicated to me. Let me
> > sketch the algorithm that's implemented by several other drivers.
> >
> > - have a power of two number of TX objects
> > - add a number of objects to struct priv (tx_num)
> >   (or make it a define, if the number of tx objects is compile time fixed)
> > - add two "unsigned int" variables to your struct priv,
> >   one "tx_head", one "tx_tail"
> > - the hard_start_xmit() writes to priv->tx_head & (priv->tx_num - 1)
> > - increment tx_head
> > - stop the tx_queue if there is no space or if the object with the
> >   lowest prio has been written
> > - in TX complete IRQ, handle priv->tx_tail object
> > - increment tx_tail
> > - wake queue if there is space but don't wake if we wait for the lowest
> >   prio object to be TX completed.
> >
> > Special care needs to be taken to implement that lock-less and race
> > free. I suggest looking at the mcp251xfd driver.
>
> After converting the driver to the above outlined implementation it
> should be more straightforward to add the caching you implemented.


I took some time to think about your suggestions.
The submitted patch was developed to improve CAN transmission
within the current driver design, in order to minimize the risk
of introducing bugs.
If I'm not missing something, you are suggesting that I change the
driver design as a precondition for applying an updated version
of my patch. IMHO this would increase the risk of introducing
bugs, even in parts of the code that are considered stable.
If the algorithm I have implemented is too complicated,
let's try to simplify it starting from the submitted patch.

Waiting for your reply, thanks and regards
Dario
