mbox series

[v3,net-next,00/10] Configuring congestion watermarks on ocelot switch using devlink-sb

Message ID 20210108175950.484854-1-olteanv@gmail.com
Headers show
Series Configuring congestion watermarks on ocelot switch using devlink-sb | expand

Message

Vladimir Oltean Jan. 8, 2021, 5:59 p.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

In some applications, it is important to create resource reservations in
the Ethernet switches, to prevent background traffic, or deliberate
attacks, from inducing denial of service into the high-priority traffic.

These patches give the user some knobs to turn. The ocelot switches
support per-port and per-port-tc reservations, on ingress and on egress.
The resources that are monitored are packet buffers (in cells of 60
bytes each) and frame references.

The frames that exceed the reservations can optionally consume from
sharing watermarks which are not per-port but global across the switch.
There are 10 sharing watermarks, 8 of them are per traffic class and 2
are per drop priority.

I am configuring the hardware using the best of my knowledge, and mostly
through trial and error. Same goes for devlink-sb integration. Feedback
is welcome.

Vladimir Oltean (10):
  net: mscc: ocelot: auto-detect packet buffer size and number of frame
    references
  net: mscc: ocelot: add ops for decoding watermark threshold and
    occupancy
  net: dsa: add ops for devlink-sb
  net: dsa: felix: reindent struct dsa_switch_ops
  net: dsa: felix: perform teardown in reverse order of setup
  net: mscc: ocelot: export NUM_TC constant from felix to common switch
    lib
  net: mscc: ocelot: delete unused ocelot_set_cpu_port prototype
  net: mscc: ocelot: register devlink ports
  net: mscc: ocelot: initialize watermarks to sane defaults
  net: mscc: ocelot: configure watermarks using devlink-sb

 drivers/net/dsa/ocelot/felix.c             | 210 +++--
 drivers/net/dsa/ocelot/felix.h             |   2 -
 drivers/net/dsa/ocelot/felix_vsc9959.c     |  23 +-
 drivers/net/dsa/ocelot/seville_vsc9953.c   |  20 +-
 drivers/net/ethernet/mscc/Makefile         |   3 +-
 drivers/net/ethernet/mscc/ocelot.c         |  18 +-
 drivers/net/ethernet/mscc/ocelot.h         |   8 +-
 drivers/net/ethernet/mscc/ocelot_devlink.c | 885 +++++++++++++++++++++
 drivers/net/ethernet/mscc/ocelot_net.c     | 282 ++++++-
 drivers/net/ethernet/mscc/ocelot_vsc7514.c |  47 +-
 include/net/dsa.h                          |  34 +
 include/soc/mscc/ocelot.h                  |  54 +-
 include/soc/mscc/ocelot_qsys.h             |   7 +-
 net/dsa/dsa2.c                             | 159 +++-
 14 files changed, 1657 insertions(+), 95 deletions(-)
 create mode 100644 drivers/net/ethernet/mscc/ocelot_devlink.c

Comments

Florian Fainelli Jan. 8, 2021, 6:30 p.m. UTC | #1
On 1/8/2021 9:59 AM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> We'll need to read back the watermark thresholds and occupancy from
> hardware (for devlink-sb integration), not only to write them as we did
> so far in ocelot_port_set_maxlen. So introduce 2 new functions in struct
> ocelot_ops, similar to wm_enc, and implement them for the 3 supported
> mscc_ocelot switches.
> 
> Remove the INUSE and MAXUSE unpacking helpers for the QSYS_RES_STAT
> register, because that doesn't scale with the number of switches that
> mscc_ocelot supports now. They have different bit widths for the
> watermarks, and we need function pointers to abstract that difference
> away.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Florian Fainelli Jan. 8, 2021, 6:33 p.m. UTC | #2
On 1/8/2021 9:59 AM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> The devlink function pointer names are super long, and they would break
> the alignment. So reindent the existing ops now by adding one tab.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Florian Fainelli Jan. 8, 2021, 6:34 p.m. UTC | #3
On 1/8/2021 9:59 AM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> We should be moving anything that isn't DSA-specific or SoC-specific out
> of the felix DSA driver, and into the common mscc_ocelot switch library.
> 
> The number of traffic classes is one of the aspects that is common
> between all ocelot switches, so it belongs in the library.
> 
> This patch also makes seville use 8 TX queues, and therefore enables
> prioritization via the QOS_CLASS field in the NPI injection header.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Florian Fainelli Jan. 9, 2021, 4:03 a.m. UTC | #4
On 1/8/2021 9:59 AM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>

> 

> Using devlink-sb, we can configure 12/16 (the important 75%) of the

> switch's controlling watermarks for congestion drops, and we can monitor

> 50% of the watermark occupancies (we can monitor the reservation

> watermarks, but not the sharing watermarks, which are exposed as pool

> sizes).

> 

> The following definitions can be made:

> 

> SB_BUF=0 # The devlink-sb for frame buffers

> SB_REF=1 # The devlink-sb for frame references

> POOL_ING=0 # The pool for ingress traffic. Both devlink-sb instances

>            # have one of these.

> POOL_EGR=1 # The pool for egress traffic. Both devlink-sb instances

>            # have one of these.

> 

> Editing the hardware watermarks is done in the following way:

> BUF_xxxx_I is accessed when sb=$SB_BUF and pool=$POOL_ING

> REF_xxxx_I is accessed when sb=$SB_REF and pool=$POOL_ING

> BUF_xxxx_E is accessed when sb=$SB_BUF and pool=$POOL_EGR

> REF_xxxx_E is accessed when sb=$SB_REF and pool=$POOL_EGR

> 

> Configuring the sharing watermarks for COL_SHR(dp=0) is done implicitly

> by modifying the corresponding pool size. By default, the pool size has

> maximum size, so this can be skipped.

> 

> devlink sb pool set pci/0000:00:00.5 sb $SB_BUF pool $POOL_ING \

> 	size 103872 thtype static

> 

> Since by default there is no buffer reservation, the above command has

> maxed out BUF_COL_SHR_I(dp=0).

> 

> Configuring the per-port reservation watermark (P_RSRV) is done in the

> following way:

> 

> devlink sb port pool set pci/0000:00:00.5/0 sb $SB_BUF \

> 	pool $POOL_ING th 1000

> 

> The above command sets BUF_P_RSRV_I(port 0) to 1000 bytes. After this

> command, the sharing watermarks are internally reconfigured with 1000

> bytes less, i.e. from 103872 bytes to 102872 bytes.

> 

> Configuring the per-port-tc reservation watermarks (Q_RSRV) is done in

> the following way:

> 

> for tc in {0..7}; do

> 	devlink sb tc bind set pci/0000:00:00.5/0 sb 0 tc $tc \

> 		type ingress pool $POOL_ING \

> 		th 3000

> done

> 

> The above command sets BUF_Q_RSRV_I(port 0, tc 0..7) to 3000 bytes.

> The sharing watermarks are again reconfigured with 24000 bytes less.

> 

> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>


Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

-- 
Florian
Jakub Kicinski Jan. 10, 2021, 1:20 a.m. UTC | #5
On Fri,  8 Jan 2021 19:59:42 +0200 Vladimir Oltean wrote:
> +	*inuse = (val & GENMASK(23, 12)) >> 12;


FWIW FIELD_GET()
Jakub Kicinski Jan. 10, 2021, 1:24 a.m. UTC | #6
On Fri,  8 Jan 2021 19:59:44 +0200 Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>

> 

> The devlink function pointer names are super long, and they would break

> the alignment. So reindent the existing ops now by adding one tab.


Therefore it'd be tempting to prefix them with dl_ rather than full
devlink_
Vladimir Oltean Jan. 11, 2021, 4:53 p.m. UTC | #7
On Sat, Jan 09, 2021 at 05:20:46PM -0800, Jakub Kicinski wrote:
> On Fri,  8 Jan 2021 19:59:42 +0200 Vladimir Oltean wrote:

> > +	*inuse = (val & GENMASK(23, 12)) >> 12;

> 

> FWIW FIELD_GET()


Do you mind if I don't use it? I don't feel that:
	*inuse = FIELD_GET(GENMASK(23, 12), val);
looks any better.
Vladimir Oltean Jan. 11, 2021, 5:01 p.m. UTC | #8
On Sat, Jan 09, 2021 at 05:24:19PM -0800, Jakub Kicinski wrote:
> On Fri,  8 Jan 2021 19:59:44 +0200 Vladimir Oltean wrote:

> > From: Vladimir Oltean <vladimir.oltean@nxp.com>

> >

> > The devlink function pointer names are super long, and they would break

> > the alignment. So reindent the existing ops now by adding one tab.

>

> Therefore it'd be tempting to prefix them with dl_ rather than full

> devlink_


Indentation is broken even with devlink_sb_occ_tc_port_bind_get reduced
to dl_sb_occ_tc_port_bind_get.
Jakub Kicinski Jan. 11, 2021, 7:10 p.m. UTC | #9
On Mon, 11 Jan 2021 18:53:21 +0200 Vladimir Oltean wrote:
> On Sat, Jan 09, 2021 at 05:20:46PM -0800, Jakub Kicinski wrote:

> > On Fri,  8 Jan 2021 19:59:42 +0200 Vladimir Oltean wrote:  

> > > +	*inuse = (val & GENMASK(23, 12)) >> 12;  

> > 

> > FWIW FIELD_GET()  

> 

> Do you mind if I don't use it? I don't feel that:

> 	*inuse = FIELD_GET(GENMASK(23, 12), val);

> looks any better.


Not at all, your call.
Jakub Kicinski Jan. 11, 2021, 7:12 p.m. UTC | #10
On Mon, 11 Jan 2021 19:01:58 +0200 Vladimir Oltean wrote:
> On Sat, Jan 09, 2021 at 05:24:19PM -0800, Jakub Kicinski wrote:

> > On Fri,  8 Jan 2021 19:59:44 +0200 Vladimir Oltean wrote:  

> > > From: Vladimir Oltean <vladimir.oltean@nxp.com>

> > >

> > > The devlink function pointer names are super long, and they would break

> > > the alignment. So reindent the existing ops now by adding one tab.  

> >

> > Therefore it'd be tempting to prefix them with dl_ rather than full

> > devlink_  

> 

> Indentation is broken even with devlink_sb_occ_tc_port_bind_get reduced

> to dl_sb_occ_tc_port_bind_get.


Ack, still, shorter is better IMHO.