mbox series

[net,v2,0/2] Fix link_mode derived params functionality

Message ID 20210404081433.1260889-1-danieller@nvidia.com
Headers show
Series Fix link_mode derived params functionality | expand

Message

Danielle Ratson April 4, 2021, 8:14 a.m. UTC
Currently, link_mode parameter derives 3 other link parameters, speed,
lanes and duplex, and the derived information is sent to user space.

Two bugs were found in that functionality.
First, some drivers clear the 'ethtool_link_ksettings' struct in their
get_link_ksettings() callback and cause receiving wrong link mode
information in user space. And also, some drivers can report random
values in the 'link_mode' field and cause general protection fault.

Second, the link parameters are only derived in netlink path so in ioctl
path, we don't any reasonable values.

Patch #1 solves the first problem by introducing a new capability bit for
supporting link_mode in driver.
Patch #2 solves the second one, by deriving the parameters in ioctl path
as well.

v2:
	* Add patch #2.
	* Introduce 'cap_link_mode_supported' instead of adding a
	  validity field to 'ethtool_link_ksettings' struct in patch #1.

Danielle Ratson (2):
  ethtool: Add link_mode parameter capability bit to ethtool_ops
  ethtool: Derive parameters from link_mode in ioctl path

 .../mellanox/mlxsw/spectrum_ethtool.c         |  1 +
 include/linux/ethtool.h                       |  5 ++-
 net/ethtool/ioctl.c                           | 34 ++++++++++++++-----
 3 files changed, 31 insertions(+), 9 deletions(-)

Comments

Andrew Lunn April 4, 2021, 2:18 p.m. UTC | #1
On Sun, Apr 04, 2021 at 11:14:32AM +0300, Danielle Ratson wrote:
> Some drivers clear the 'ethtool_link_ksettings' struct in their
> get_link_ksettings() callback, before populating it with actual values.
> Such drivers will set the new 'link_mode' field to zero, resulting in
> user space receiving wrong link mode information given that zero is a
> valid value for the field.

That sounds like the bug, that 0 means something, not that the
structure is zeroed. It is good practice to zero structures about to
be returned to user space otherwise you could leak stack information
etc.

Do we still have time to fix the KAPI, so zero means unset? Where in
the workflow is the patch adding this feature? Is it in a released
kernel? or -rc kernel?

> Fix this by introducing a new capability bit ('cap_link_mode_supported')
> to ethtool_ops, which indicates whether the driver supports 'link_mode'
> parameter or not. Set it to true in mlxsw which is currently the only
> driver supporting 'link_mode'.
> 
> Another problem is that some drivers (notably tun) can report random
> values in the 'link_mode' field. This can result in a general protection
> fault when the field is used as an index to the 'link_mode_params' array
> [1].
> 
> This happens because such drivers implement their set_link_ksettings()
> callback by simply overwriting their private copy of
> 'ethtool_link_ksettings' struct with the one they get from the stack,
> which is not always properly initialized.
> 
> Fix this by making sure that the implied parameters will be derived from
> the 'link_mode' parameter only when the 'cap_link_mode_supported' is set.

So you left in place the kernel memory leak to user space? Or is there
an additional patch to fix tun as well?

   Andrew
Danielle Ratson April 4, 2021, 3:04 p.m. UTC | #2
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Sunday, April 4, 2021 5:18 PM
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; kuba@kernel.org; eric.dumazet@gmail.com; mkubecek@suse.cz;
> f.fainelli@gmail.com; acardace@redhat.com; irusskikh@marvell.com; gustavo@embeddedor.com; magnus.karlsson@intel.com;
> ecree@solarflare.com; Ido Schimmel <idosch@nvidia.com>; Jiri Pirko <jiri@nvidia.com>; mlxsw <mlxsw@nvidia.com>
> Subject: Re: [PATCH net v2 1/2] ethtool: Add link_mode parameter capability bit to ethtool_ops
> 
> On Sun, Apr 04, 2021 at 11:14:32AM +0300, Danielle Ratson wrote:
> > Some drivers clear the 'ethtool_link_ksettings' struct in their
> > get_link_ksettings() callback, before populating it with actual values.
> > Such drivers will set the new 'link_mode' field to zero, resulting in
> > user space receiving wrong link mode information given that zero is a
> > valid value for the field.
> 
> That sounds like the bug, that 0 means something, not that the structure is zeroed. It is good practice to zero structures about to be
> returned to user space otherwise you could leak stack information etc.
> 
> Do we still have time to fix the KAPI, so zero means unset? Where in the workflow is the patch adding this feature? Is it in a released
> kernel? or -rc kernel?

First, it is not the API structure that is passed to user space. Please pay attention that the link_mode field is added to "ethtool_link_ksettings" and not "ethtool_link_settings", which is the API.
So I am not sure what leak could happen in that situation, could you explain?

Second, the link_mode indices are uAPI, so 0 is not forbidden.

> 
> > Fix this by introducing a new capability bit
> > ('cap_link_mode_supported') to ethtool_ops, which indicates whether the driver supports 'link_mode'
> > parameter or not. Set it to true in mlxsw which is currently the only
> > driver supporting 'link_mode'.
> >
> > Another problem is that some drivers (notably tun) can report random
> > values in the 'link_mode' field. This can result in a general
> > protection fault when the field is used as an index to the
> > 'link_mode_params' array [1].
> >
> > This happens because such drivers implement their set_link_ksettings()
> > callback by simply overwriting their private copy of
> > 'ethtool_link_ksettings' struct with the one they get from the stack,
> > which is not always properly initialized.
> >
> > Fix this by making sure that the implied parameters will be derived
> > from the 'link_mode' parameter only when the 'cap_link_mode_supported' is set.
> 
> So you left in place the kernel memory leak to user space? Or is there an additional patch to fix tun as well?
> 
>    Andrew

Again, not sure what is the memory leak you are talking about. 
This patch solves the protection fault in tun driver, by the fact that now we are paying attention to the link_mode value only if the driver confirmed it supports the link_mode and set it properly.
Andrew Lunn April 4, 2021, 4:07 p.m. UTC | #3
> First, it is not the API structure that is passed to user space. Please pay attention

Yes, sorry. Jumped to assumptions without checking.

Let me try again.

    Andrew
Danielle Ratson April 5, 2021, 5:01 p.m. UTC | #4
> -----Original Message-----

> From: Andrew Lunn <andrew@lunn.ch>

> Sent: Sunday, April 4, 2021 7:33 PM

> To: Danielle Ratson <danieller@nvidia.com>

> Cc: netdev@vger.kernel.org; davem@davemloft.net; kuba@kernel.org; eric.dumazet@gmail.com; mkubecek@suse.cz;

> f.fainelli@gmail.com; acardace@redhat.com; irusskikh@marvell.com; gustavo@embeddedor.com; magnus.karlsson@intel.com;

> ecree@solarflare.com; Ido Schimmel <idosch@nvidia.com>; Jiri Pirko <jiri@nvidia.com>; mlxsw <mlxsw@nvidia.com>

> Subject: Re: [PATCH net v2 1/2] ethtool: Add link_mode parameter capability bit to ethtool_ops

> 

> > @@ -436,12 +436,16 @@ int __ethtool_get_link_ksettings(struct

> > net_device *dev,

> >

> >  	memset(link_ksettings, 0, sizeof(*link_ksettings));

> >

> > -	link_ksettings->link_mode = -1;

> >  	err = dev->ethtool_ops->get_link_ksettings(dev, link_ksettings);

> >  	if (err)

> >  		return err;

> >

> > -	if (link_ksettings->link_mode != -1) {

> > +	if (dev->ethtool_ops->cap_link_mode_supported &&

> > +	    link_ksettings->link_mode != -1) {

> 

> Is this -1 behaviour documented anywhere? It seems like you just changed its meaning. It used to mean, this field has not been set,

> ignore it. Adding the cap_link_mode_supported it now means, we have field has been set, but we have no idea what link mode is

> being used.

> So you should probably add something to the documentation of struct ethtool_link_ksettings.

> 

> I wonder if we should actually add ETHTOOL_LINK_MODE_UNKNOWN to enum ethtool_link_mode_bit_indices?

> 

> > +		if (WARN_ON_ONCE(link_ksettings->link_mode >=

> > +				 __ETHTOOL_LINK_MODE_MASK_NBITS))

> > +			return -EINVAL;

> > +

> >  		link_info = &link_mode_params[link_ksettings->link_mode];

> >  		link_ksettings->base.speed = link_info->speed;

> >  		link_ksettings->lanes = link_info->lanes;

> 

> If dev->ethtool_ops->cap_link_mode_supported && link_ksettings->link_mode == -1 should you be setting speed to

> SPEED_UNKNOWN, and lanes to LANE_UNKNOWN? Or is that already the default?

> 

> But over all, this API between the core and the driver seems messy. Why not just add a helper in common.c which translates link

> mode to speed/duplex/lanes and call it in the driver. Then you don't need this capability flags, which i doubt any other driver will ever

> use. And you don't need to worry about drivers returning random values. As far as i can see, the link_mode returned by the driver is

> not used for anything other than for this translation. So i don't see a need for it outside of the driver. Or maybe i'm missing

> something?

> 

> 	Andrew


Hi Andrew,

Please see my patch here: https://github.com/daniellerts/linux_mlxsw/commit/72ca614951418843aa87323630c354691d9e50d4
Since a lack of time until the window closes, please see if that is what you meant and you are ok with it, before I am sending another version.

Thanks,
Danielle