diff mbox series

[RFC,net-next,6/7] net: dsa: mv88e6xxx: provide software node for default settings

Message ID E1pex8f-00Dvo9-KT@rmk-PC.armlinux.org.uk
State New
Headers show
Series Another attempt at moving mv88e6xxx forward | expand

Commit Message

Russell King (Oracle) March 22, 2023, noon UTC
Provide a software node for default settings when no phy, sfp or
fixed link is specified.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 109 +++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)

Comments

Andrew Lunn March 22, 2023, 8:17 p.m. UTC | #1
On Wed, Mar 22, 2023 at 08:13:51PM +0000, Russell King (Oracle) wrote:
> On Wed, Mar 22, 2023 at 07:57:19PM +0100, Andrew Lunn wrote:
> > > +static struct fwnode_handle *mv88e6xxx_port_get_fwnode(struct dsa_switch *ds,
> > > +						       int port,
> > > +						       struct fwnode_handle *h)
> > > +{
> > > +	struct mv88e6xxx_chip *chip = ds->priv;
> > > +	struct device_node *np, *phy_node;
> > > +	int speed, duplex, err;
> > > +	phy_interface_t mode;
> > > +	struct dsa_port *dp;
> > > +	unsigned long caps;
> > > +
> > > +	dp = dsa_to_port(ds, port);
> > > +	if (dsa_port_is_user(dp))
> > > +		return h;
> > > +
> > > +	/* No DT? Eh? */
> > > +	np = to_of_node(h);
> > > +	if (!np)
> > > +		return h;
> > 
> > I've not looked at the big picture yet, but you can have a simple
> > switch setup without DT. I have a couple of amd64 boards which use
> > platform data. The user ports all have internal PHYs, and the CPU port
> > defaults to 1G, it might even be strapped that way.
> 
> Are you suggesting that we should generate some swnode description of
> the max interface mode and speed if we are missing a DT node?
> 
> I'm not seeing any port specific data in the mv88e6xxx platform data.

No, i'm just pointing out that not have DT is not an error, and can
happen. I just wanted to make sure you are not assuming there is
always DT.

	Andrew
Andrew Lunn March 22, 2023, 9:40 p.m. UTC | #2
> What I'm trying to find out is what you think the behaviour should be
> in this case. Are you suggesting we should fall back to what we do now
> which is let the driver do it internally without phylink.
> 
> The problem is that if we don't go down the phylink route for everything
> then we /can't/ convert mv88e6xxx to phylink_pcs, because the "serdes"
> stuff will be gone, and the absence of phylink will mean those won't be
> called e.g. to power up the serdes.

I'm pretty sure non-DT systems have never used SERDES. They are using
a back to back PHY, or maybe RGMII. So long as this keeps working, we
can convert to phylink.

And i have such a amd64 system, using back to back PHYs so i can test
it does not regress.

    Andrew
Russell King (Oracle) March 23, 2023, 6:25 p.m. UTC | #3
On Thu, Mar 23, 2023 at 07:17:27PM +0100, Andrew Lunn wrote:
> > So, given that this is only supposed to be used for mv88e6xxx because
> > of it's legacy, maybe the check in dsa_port_phylink_create() should
> > be:
> > 
> >         fwnode = of_fwnode_handle(dp->dn);
> >         if (fwnode && ds->ops->port_get_fwnode) {
> > 
> > In other words, we only allow the replacement of the firmware
> > description if one already existed.
> 
> That sounds reasonable.
> 
> > Alternatively, we could use:
> > 
> > 	if (!dsa_port_is_user(dp) && ds->ops->port_get_fwnode) {
> > 
> > since mv88e6xxx today only does this "max speed" thing for CPU and
> > DSA ports, and thus we only need to replace the firmware description
> > for these ports - and we can document that port_get_fwnode is only
> > for CPU and DSA ports.
> 
> Also reasonable.
> 
> The first seems better for the Non-DT, where as the second makes it
> clear it is supposed to be for CPU and DSA ports only.
> 
> Is it over the top to combine them?

To be clear, you're suggesting:

	if (!dsa_port_is_user(dp) && fwnode && ds->ops->port_get_fwnode) {

?

If so, yes - you know better than I how these bits are supposed to work.
Thanks.
Heikki Krogerus March 24, 2023, 2:49 p.m. UTC | #4
Hi Russell,

On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> +							   int speed,
> +							   int duplex)
> +{
> +	struct property_entry fixed_link_props[3] = { };
> +
> +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> +	if (duplex == DUPLEX_FULL)
> +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> +
> +	return fwnode_create_named_software_node(fixed_link_props, parent,
> +						 "fixed-link");
> +}
> +
> +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> +							  int speed,
> +							  int duplex)
> +{
> +	struct property_entry port_props[2] = {};
> +	struct fwnode_handle *fixed_link_fwnode;
> +	struct fwnode_handle *new_port_fwnode;
> +
> +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> +	if (IS_ERR(new_port_fwnode))
> +		return new_port_fwnode;
> +
> +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> +							  speed, duplex);
> +	if (IS_ERR(fixed_link_fwnode)) {
> +		fwnode_remove_software_node(new_port_fwnode);
> +		return fixed_link_fwnode;
> +	}
> +
> +	return new_port_fwnode;
> +}

That new fwnode_create_named_software_node() function looks like a
conflict waiting to happen - if a driver adds a node to the root level
(does not have to be root level), all the tests will pass because
there is only a single device, but when a user later tries the driver
with two devices, it fails, because the node already exist. But you
don't need that function at all.

Here's an example how you can add the nodes with the already existing
APIs. To keep this example simple, I'm expecting that you have members
for the software nodes to the struct mv88e6xxx_chip:

struct mv88e6xxx_chip {
        ...
        /* swnodes */
        struct software_node port_swnode;
        struct software_node fixed_link_swnode;
};

Of course, you don't have to add those members if you don't want to.
Then you just need to allocate the nodes separately, but that should
not be a problem. In any case, something like this:

static struct fwnode_handle *mv88e6xxx_create_port_swnode(struct mv88e6xxx_chip *chip,
                                                          phy_interface_t mode,
							  int speed,
							  int duplex)
{
	struct property_entry fixed_link_props[3] = { };
	struct property_entry port_props[2] = { };
	int ret;

        /*
         * First register the port node.
         */
	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));

	chip->port_swnode.properties = property_entries_dup(port_props);
        if (IS_ERR(chip->port_swnode.properties))
                return ERR_CAST(chip->port_swnode.properties);

	ret = software_node_register(&chip->port_swnode);
	if (ret) {
                kfree(chip->port_swnode.properties);
		return ERR_PTR(ret);
        }

        /*
         * Then the second node, child of the port node.
         */
	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
	if (duplex == DUPLEX_FULL)
		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");

        chip->fixed_link_swnode.name = "fixed-link";
        chip->fixed_link_swnode.parent = &chip->port_swnode;
	chip->fixed_link_swnode.properties = property_entries_dup(fixed_link_props);
        if (IS_ERR(chip->port_swnode.properties)) {
                software_node_unregister(&chip->port_swnode);
                kfree(chip->port_swnode.properties);
                return ERR_CAST(chip->fixed_link_swnode.properties);
        }

	ret = software_node_register(&chip->fixed_link_swnode);
        if (ret) {
                software_node_unregister(&chip->port_swnode);
                kfree(chip->port_swnode.properties);
                kfree(chip->fixed_link_swnode.properties);
		return ERR_PTR(ret);
        }

        /*
         * Finally, return the port fwnode.
         */
        return software_node_fwnode(&chip->port_swnode);
}

thanks,
Russell King (Oracle) March 24, 2023, 5:04 p.m. UTC | #5
On Fri, Mar 24, 2023 at 04:49:32PM +0200, Heikki Krogerus wrote:
> Hi Russell,
> 
> On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> > +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> > +							   int speed,
> > +							   int duplex)
> > +{
> > +	struct property_entry fixed_link_props[3] = { };
> > +
> > +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > +	if (duplex == DUPLEX_FULL)
> > +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > +
> > +	return fwnode_create_named_software_node(fixed_link_props, parent,
> > +						 "fixed-link");
> > +}
> > +
> > +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> > +							  int speed,
> > +							  int duplex)
> > +{
> > +	struct property_entry port_props[2] = {};
> > +	struct fwnode_handle *fixed_link_fwnode;
> > +	struct fwnode_handle *new_port_fwnode;
> > +
> > +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > +	if (IS_ERR(new_port_fwnode))
> > +		return new_port_fwnode;
> > +
> > +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> > +							  speed, duplex);
> > +	if (IS_ERR(fixed_link_fwnode)) {
> > +		fwnode_remove_software_node(new_port_fwnode);
> > +		return fixed_link_fwnode;
> > +	}
> > +
> > +	return new_port_fwnode;
> > +}
> 
> That new fwnode_create_named_software_node() function looks like a
> conflict waiting to happen - if a driver adds a node to the root level
> (does not have to be root level), all the tests will pass because
> there is only a single device, but when a user later tries the driver
> with two devices, it fails, because the node already exist. But you
> don't need that function at all.

I think you're totally failing to explain how this can fail.

Let me reiterate what thestructure of the swnodes here is:

	root
	`- node%d (%d allocated by root IDA)
	   +- phy-mode property
	   `- fixed-link
	      +- speed property
	      `- optional full-duplex property

If we have two different devices creating these nodes, then at the
root level, they will end up having different root names. The
"fixed-link" is a child of this node.

swnode already allows multiple identical names at the sub-node
level - each node ends up with its own IDA to allocate the generic
"node%d" names from. So as soon as we have multiple nodes, they
end up as this:

	root
	+- node0
	|  `- node 0
	+- node1
	|  `- node 0
	+- node2
	|  `- node 0
	etc

So, if we end up with two devices creating these at the same time,
we end up with:

	root
	+- nodeA (A allocated by root IDA)
	|  +- phy-mode property
	|  `- fixed-link
	|     +- speed property
	|     `- optional full-duplex property
	`- nodeB (B allocated by root IDA, different from above)
	   +- phy-mode property
	   `- fixed-link
	      +- speed property
	      `- optional full-duplex property

Since the kobject is parented to the parent's kobject, what we
end up with in sysfs is:

	.../nodeA/fixed-link/speed
	.../nodeB/fixed-link/speed

Thus, the "fixed-link" ndoes can _not_ conflict.

Please explain in detail where you think the conflict is, because
so far no one has been able to counter my assertions that this is
_safe_ with a proper full technical description of the problem.
All I get is hand-wavey "this conflicts".

Honestly, I'm getting sick of poor quality reviews... the next
poor review that claims there's a conflict here without properly
explain it will be told where to go.
Heikki Krogerus March 27, 2023, 10:28 a.m. UTC | #6
On Fri, Mar 24, 2023 at 05:04:25PM +0000, Russell King (Oracle) wrote:
> On Fri, Mar 24, 2023 at 04:49:32PM +0200, Heikki Krogerus wrote:
> > Hi Russell,
> > 
> > On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> > > +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> > > +							   int speed,
> > > +							   int duplex)
> > > +{
> > > +	struct property_entry fixed_link_props[3] = { };
> > > +
> > > +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > > +	if (duplex == DUPLEX_FULL)
> > > +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > > +
> > > +	return fwnode_create_named_software_node(fixed_link_props, parent,
> > > +						 "fixed-link");
> > > +}
> > > +
> > > +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> > > +							  int speed,
> > > +							  int duplex)
> > > +{
> > > +	struct property_entry port_props[2] = {};
> > > +	struct fwnode_handle *fixed_link_fwnode;
> > > +	struct fwnode_handle *new_port_fwnode;
> > > +
> > > +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > > +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > > +	if (IS_ERR(new_port_fwnode))
> > > +		return new_port_fwnode;
> > > +
> > > +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> > > +							  speed, duplex);
> > > +	if (IS_ERR(fixed_link_fwnode)) {
> > > +		fwnode_remove_software_node(new_port_fwnode);
> > > +		return fixed_link_fwnode;
> > > +	}
> > > +
> > > +	return new_port_fwnode;
> > > +}
> > 
> > That new fwnode_create_named_software_node() function looks like a
> > conflict waiting to happen - if a driver adds a node to the root level
> > (does not have to be root level), all the tests will pass because
> > there is only a single device, but when a user later tries the driver
> > with two devices, it fails, because the node already exist. But you
> > don't need that function at all.
> 
> I think you're totally failing to explain how this can fail.
> 
> Let me reiterate what thestructure of the swnodes here is:
> 
> 	root
> 	`- node%d (%d allocated by root IDA)
> 	   +- phy-mode property
> 	   `- fixed-link
> 	      +- speed property
> 	      `- optional full-duplex property
> 
> If we have two different devices creating these nodes, then at the
> root level, they will end up having different root names. The
> "fixed-link" is a child of this node.

Ah, sorry, the problem is not with this patch, or your use case. The
problem is with the PATCH 1/7 of this series where you introduce that
new function fwnode_create_named_software_node() which will not be
tied to your use case only. In this patch you just use that function.
I should have been more clear on that.

I really just wanted to show how you can create those nodes by using
the API designed for the statically described software nodes. So you
don't need that new function. Please check that proposal from my
original reply.

If the potential conflict that the new function creates is still not
clear, then - firstly, you have to remember that that API is not only
for your drivers, it's generic API! - the problem comes from the fact
that there simply is nothing preventing it from being used to place
the new nodes always at the same level. So for example using NULL as
the parent:

        fwnode = fwnode_create_named_software_node(props,
                                                   NULL, /* NOTE */
                                                   "same_name");

thanks,
Russell King (Oracle) March 27, 2023, 10:55 a.m. UTC | #7
On Mon, Mar 27, 2023 at 01:28:06PM +0300, Heikki Krogerus wrote:
> On Fri, Mar 24, 2023 at 05:04:25PM +0000, Russell King (Oracle) wrote:
> > On Fri, Mar 24, 2023 at 04:49:32PM +0200, Heikki Krogerus wrote:
> > > Hi Russell,
> > > 
> > > On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> > > > +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> > > > +							   int speed,
> > > > +							   int duplex)
> > > > +{
> > > > +	struct property_entry fixed_link_props[3] = { };
> > > > +
> > > > +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > > > +	if (duplex == DUPLEX_FULL)
> > > > +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > > > +
> > > > +	return fwnode_create_named_software_node(fixed_link_props, parent,
> > > > +						 "fixed-link");
> > > > +}
> > > > +
> > > > +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> > > > +							  int speed,
> > > > +							  int duplex)
> > > > +{
> > > > +	struct property_entry port_props[2] = {};
> > > > +	struct fwnode_handle *fixed_link_fwnode;
> > > > +	struct fwnode_handle *new_port_fwnode;
> > > > +
> > > > +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > > > +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > > > +	if (IS_ERR(new_port_fwnode))
> > > > +		return new_port_fwnode;
> > > > +
> > > > +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> > > > +							  speed, duplex);
> > > > +	if (IS_ERR(fixed_link_fwnode)) {
> > > > +		fwnode_remove_software_node(new_port_fwnode);
> > > > +		return fixed_link_fwnode;
> > > > +	}
> > > > +
> > > > +	return new_port_fwnode;
> > > > +}
> > > 
> > > That new fwnode_create_named_software_node() function looks like a
> > > conflict waiting to happen - if a driver adds a node to the root level
> > > (does not have to be root level), all the tests will pass because
> > > there is only a single device, but when a user later tries the driver
> > > with two devices, it fails, because the node already exist. But you
> > > don't need that function at all.
> > 
> > I think you're totally failing to explain how this can fail.
> > 
> > Let me reiterate what thestructure of the swnodes here is:
> > 
> > 	root
> > 	`- node%d (%d allocated by root IDA)
> > 	   +- phy-mode property
> > 	   `- fixed-link
> > 	      +- speed property
> > 	      `- optional full-duplex property
> > 
> > If we have two different devices creating these nodes, then at the
> > root level, they will end up having different root names. The
> > "fixed-link" is a child of this node.
> 
> Ah, sorry, the problem is not with this patch, or your use case. The
> problem is with the PATCH 1/7 of this series where you introduce that
> new function fwnode_create_named_software_node() which will not be
> tied to your use case only. In this patch you just use that function.
> I should have been more clear on that.

How is this any different from creating two struct device's with the
same parent and the same name? Or kobject_add() with the same parent
and name?

> I really just wanted to show how you can create those nodes by using
> the API designed for the statically described software nodes. So you
> don't need that new function. Please check that proposal from my
> original reply.

I don't see why I should. This is clearly a case that if one creates
two named nodes with the same name and same parent, it should fail and
it's definitely a "well don't do that then" in just the same way that
one doesn't do it with kobject_add() or any of the other numerous
interfaces that take names in a space that need to be unique.

I really don't think there is any issue here to be solved. In fact,
I think solving it will add additional useless complexity that just
isn't required - which adds extra code that can be wrong and fail.

Let's keep this simple. This approach is simple. If one does something
stupid (like creating two named nodes with the same name and same
parent) then it will verbosely fail. That is a good thing.

Internal kernel APIs are not supposed to protect people from being
stupid.
Heikki Krogerus March 27, 2023, 2:13 p.m. UTC | #8
Hi Russell,

On Mon, Mar 27, 2023 at 11:55:00AM +0100, Russell King (Oracle) wrote:
> On Mon, Mar 27, 2023 at 01:28:06PM +0300, Heikki Krogerus wrote:
> > On Fri, Mar 24, 2023 at 05:04:25PM +0000, Russell King (Oracle) wrote:
> > > On Fri, Mar 24, 2023 at 04:49:32PM +0200, Heikki Krogerus wrote:
> > > > Hi Russell,
> > > > 
> > > > On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> > > > > +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> > > > > +							   int speed,
> > > > > +							   int duplex)
> > > > > +{
> > > > > +	struct property_entry fixed_link_props[3] = { };
> > > > > +
> > > > > +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > > > > +	if (duplex == DUPLEX_FULL)
> > > > > +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > > > > +
> > > > > +	return fwnode_create_named_software_node(fixed_link_props, parent,
> > > > > +						 "fixed-link");
> > > > > +}
> > > > > +
> > > > > +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> > > > > +							  int speed,
> > > > > +							  int duplex)
> > > > > +{
> > > > > +	struct property_entry port_props[2] = {};
> > > > > +	struct fwnode_handle *fixed_link_fwnode;
> > > > > +	struct fwnode_handle *new_port_fwnode;
> > > > > +
> > > > > +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > > > > +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > > > > +	if (IS_ERR(new_port_fwnode))
> > > > > +		return new_port_fwnode;
> > > > > +
> > > > > +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> > > > > +							  speed, duplex);
> > > > > +	if (IS_ERR(fixed_link_fwnode)) {
> > > > > +		fwnode_remove_software_node(new_port_fwnode);
> > > > > +		return fixed_link_fwnode;
> > > > > +	}
> > > > > +
> > > > > +	return new_port_fwnode;
> > > > > +}
> > > > 
> > > > That new fwnode_create_named_software_node() function looks like a
> > > > conflict waiting to happen - if a driver adds a node to the root level
> > > > (does not have to be root level), all the tests will pass because
> > > > there is only a single device, but when a user later tries the driver
> > > > with two devices, it fails, because the node already exist. But you
> > > > don't need that function at all.
> > > 
> > > I think you're totally failing to explain how this can fail.
> > > 
> > > Let me reiterate what thestructure of the swnodes here is:
> > > 
> > > 	root
> > > 	`- node%d (%d allocated by root IDA)
> > > 	   +- phy-mode property
> > > 	   `- fixed-link
> > > 	      +- speed property
> > > 	      `- optional full-duplex property
> > > 
> > > If we have two different devices creating these nodes, then at the
> > > root level, they will end up having different root names. The
> > > "fixed-link" is a child of this node.
> > 
> > Ah, sorry, the problem is not with this patch, or your use case. The
> > problem is with the PATCH 1/7 of this series where you introduce that
> > new function fwnode_create_named_software_node() which will not be
> > tied to your use case only. In this patch you just use that function.
> > I should have been more clear on that.
> 
> How is this any different from creating two struct device's with the
> same parent and the same name? Or kobject_add() with the same parent
> and name?

But that can not mean we have to take the same risk everywhere. I do
understand that we don't protect developers from doing silly decisions
in kernel, but that does not mean that we should simply accept
interfaces into the kernel that expose these risk if we don't need
them.

> > I really just wanted to show how you can create those nodes by using
> > the API designed for the statically described software nodes. So you
> > don't need that new function. Please check that proposal from my
> > original reply.
> 
> I don't see why I should. This is clearly a case that if one creates
> two named nodes with the same name and same parent, it should fail and
> it's definitely a "well don't do that then" in just the same way that
> one doesn't do it with kobject_add() or any of the other numerous
> interfaces that take names in a space that need to be unique.
> 
> I really don't think there is any issue here to be solved. In fact,
> I think solving it will add additional useless complexity that just
> isn't required - which adds extra code that can be wrong and fail.
> 
> Let's keep this simple. This approach is simple. If one does something
> stupid (like creating two named nodes with the same name and same
> parent) then it will verbosely fail. That is a good thing.

Well, I think the most simplest approach would be to have both the
nodes and the properties as part of that struct mv88e6xxx_chip:

struct mv88e6xxx_chip {
        ...
       struct property_entry port_props[2];
       struct property_entry fixed_link_props[3];

       struct software_node port_swnode;
       struct software_node fixed_link_swnode;
};

That allows you to register both nodes in one go:

static struct fwnode_handle *mv88e6xxx_create_port_swnode(struct mv88e6xxx_chip *chip,
                                                          phy_interface_t mode,
                                                          int speed,
                                                          int duplex)
{
        struct software_node *nodes[3] = {
                &chip->port_swnode,
                &chip->fixed_link_swnode,
        };
        int ret;

        chip->port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
        chip->port_swnode.properties = chip->port_props;

        chip->fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
        if (duplex == DUPLEX_FULL)
                chip->fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");

        chip->fixed_link_swnode.name = "fixed-link";
        chip->fixed_link_swnode.parent = &chip->port_swnode;
        chip->fixed_link_swnode.properties = chip->fixed_link_props;

        ret = software_node_register_node_group(nodes);
        if (ret)
                return ERR_PTR(ret);

        return software_node_fwnode(&chip->port_swnode);
}

> Internal kernel APIs are not supposed to protect people from being
> stupid.

This is an interesting topic. I used to agree with this
idea/philosophy without much thought, but then after (years of :-)
listening maintainers complaining about how new developers always
repeat the same mistakes over an over again, I've started thinking
about it. To use a bit rough analog, if we give a gun to a monkey, how
can we be surprised if if ends up shooting first its mates and then
its own brains out...

Perhaps this is a more generic subject, a topic for some conference
maybe, but I in any case don't think this is a black and white matter.
A little bit of protection is not only a harmful thing and always only
in the way of flexibility.

At the very least, I don't think we should not use this philosophy as
an argument for doing things in ways that may expose even minute risks
in the cases like this were an alternative approach already exists.

Br,
Russell King (Oracle) March 27, 2023, 2:32 p.m. UTC | #9
On Mon, Mar 27, 2023 at 05:13:25PM +0300, Heikki Krogerus wrote:
> Hi Russell,
> 
> On Mon, Mar 27, 2023 at 11:55:00AM +0100, Russell King (Oracle) wrote:
> > On Mon, Mar 27, 2023 at 01:28:06PM +0300, Heikki Krogerus wrote:
> > > On Fri, Mar 24, 2023 at 05:04:25PM +0000, Russell King (Oracle) wrote:
> > > > On Fri, Mar 24, 2023 at 04:49:32PM +0200, Heikki Krogerus wrote:
> > > > > Hi Russell,
> > > > > 
> > > > > On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> > > > > > +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> > > > > > +							   int speed,
> > > > > > +							   int duplex)
> > > > > > +{
> > > > > > +	struct property_entry fixed_link_props[3] = { };
> > > > > > +
> > > > > > +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > > > > > +	if (duplex == DUPLEX_FULL)
> > > > > > +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > > > > > +
> > > > > > +	return fwnode_create_named_software_node(fixed_link_props, parent,
> > > > > > +						 "fixed-link");
> > > > > > +}
> > > > > > +
> > > > > > +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> > > > > > +							  int speed,
> > > > > > +							  int duplex)
> > > > > > +{
> > > > > > +	struct property_entry port_props[2] = {};
> > > > > > +	struct fwnode_handle *fixed_link_fwnode;
> > > > > > +	struct fwnode_handle *new_port_fwnode;
> > > > > > +
> > > > > > +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > > > > > +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > > > > > +	if (IS_ERR(new_port_fwnode))
> > > > > > +		return new_port_fwnode;
> > > > > > +
> > > > > > +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> > > > > > +							  speed, duplex);
> > > > > > +	if (IS_ERR(fixed_link_fwnode)) {
> > > > > > +		fwnode_remove_software_node(new_port_fwnode);
> > > > > > +		return fixed_link_fwnode;
> > > > > > +	}
> > > > > > +
> > > > > > +	return new_port_fwnode;
> > > > > > +}
> > > > > 
> > > > > That new fwnode_create_named_software_node() function looks like a
> > > > > conflict waiting to happen - if a driver adds a node to the root level
> > > > > (does not have to be root level), all the tests will pass because
> > > > > there is only a single device, but when a user later tries the driver
> > > > > with two devices, it fails, because the node already exist. But you
> > > > > don't need that function at all.
> > > > 
> > > > I think you're totally failing to explain how this can fail.
> > > > 
> > > > Let me reiterate what thestructure of the swnodes here is:
> > > > 
> > > > 	root
> > > > 	`- node%d (%d allocated by root IDA)
> > > > 	   +- phy-mode property
> > > > 	   `- fixed-link
> > > > 	      +- speed property
> > > > 	      `- optional full-duplex property
> > > > 
> > > > If we have two different devices creating these nodes, then at the
> > > > root level, they will end up having different root names. The
> > > > "fixed-link" is a child of this node.
> > > 
> > > Ah, sorry, the problem is not with this patch, or your use case. The
> > > problem is with the PATCH 1/7 of this series where you introduce that
> > > new function fwnode_create_named_software_node() which will not be
> > > tied to your use case only. In this patch you just use that function.
> > > I should have been more clear on that.
> > 
> > How is this any different from creating two struct device's with the
> > same parent and the same name? Or kobject_add() with the same parent
> > and name?
> 
> But that can not mean we have to take the same risk everywhere. I do
> understand that we don't protect developers from doing silly decisions
> in kernel, but that does not mean that we should simply accept
> interfaces into the kernel that expose these risk if we don't need
> them.
> 
> > > I really just wanted to show how you can create those nodes by using
> > > the API designed for the statically described software nodes. So you
> > > don't need that new function. Please check that proposal from my
> > > original reply.
> > 
> > I don't see why I should. This is clearly a case that if one creates
> > two named nodes with the same name and same parent, it should fail and
> > it's definitely a "well don't do that then" in just the same way that
> > one doesn't do it with kobject_add() or any of the other numerous
> > interfaces that take names in a space that need to be unique.
> > 
> > I really don't think there is any issue here to be solved. In fact,
> > I think solving it will add additional useless complexity that just
> > isn't required - which adds extra code that can be wrong and fail.
> > 
> > Let's keep this simple. This approach is simple. If one does something
> > stupid (like creating two named nodes with the same name and same
> > parent) then it will verbosely fail. That is a good thing.
> 
> Well, I think the most simplest approach would be to have both the
> nodes and the properties as part of that struct mv88e6xxx_chip:
> 
> struct mv88e6xxx_chip {
>         ...
>        struct property_entry port_props[2];
>        struct property_entry fixed_link_props[3];
> 
>        struct software_node port_swnode;
>        struct software_node fixed_link_swnode;
> };
> 
> That allows you to register both nodes in one go:
> 
> static struct fwnode_handle *mv88e6xxx_create_port_swnode(struct mv88e6xxx_chip *chip,
>                                                           phy_interface_t mode,
>                                                           int speed,
>                                                           int duplex)
> {
>         struct software_node *nodes[3] = {
>                 &chip->port_swnode,
>                 &chip->fixed_link_swnode,
>         };
>         int ret;
> 
>         chip->port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
>         chip->port_swnode.properties = chip->port_props;
> 
>         chip->fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
>         if (duplex == DUPLEX_FULL)
>                 chip->fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> 
>         chip->fixed_link_swnode.name = "fixed-link";
>         chip->fixed_link_swnode.parent = &chip->port_swnode;
>         chip->fixed_link_swnode.properties = chip->fixed_link_props;
> 
>         ret = software_node_register_node_group(nodes);
>         if (ret)
>                 return ERR_PTR(ret);
> 
>         return software_node_fwnode(&chip->port_swnode);
> }

You're suggesting code that passes a fwnode pointer back up layers
that has been allocated in the driver's private structure, assuming
that those upper layers are going to release this before re-calling
this function for a different port. They do today, but in the future?

There are always plenty of guns...

If we don't want to give the monkey the gun, we need a more complex
solution than that... and it becomes a question about how far you
want to take gun control.

Then there's the question about why we should have this data allocated
permanently in the system when it is only used for a very short period
during driver initialisation. That seems to be a complete waste of
resources.
Russell King (Oracle) March 27, 2023, 3:45 p.m. UTC | #10
On Mon, Mar 27, 2023 at 03:32:46PM +0100, Russell King (Oracle) wrote:
> On Mon, Mar 27, 2023 at 05:13:25PM +0300, Heikki Krogerus wrote:
> > Hi Russell,
> > 
> > On Mon, Mar 27, 2023 at 11:55:00AM +0100, Russell King (Oracle) wrote:
> > > On Mon, Mar 27, 2023 at 01:28:06PM +0300, Heikki Krogerus wrote:
> > > > On Fri, Mar 24, 2023 at 05:04:25PM +0000, Russell King (Oracle) wrote:
> > > > > On Fri, Mar 24, 2023 at 04:49:32PM +0200, Heikki Krogerus wrote:
> > > > > > Hi Russell,
> > > > > > 
> > > > > > On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> > > > > > > +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> > > > > > > +							   int speed,
> > > > > > > +							   int duplex)
> > > > > > > +{
> > > > > > > +	struct property_entry fixed_link_props[3] = { };
> > > > > > > +
> > > > > > > +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > > > > > > +	if (duplex == DUPLEX_FULL)
> > > > > > > +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > > > > > > +
> > > > > > > +	return fwnode_create_named_software_node(fixed_link_props, parent,
> > > > > > > +						 "fixed-link");
> > > > > > > +}
> > > > > > > +
> > > > > > > +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> > > > > > > +							  int speed,
> > > > > > > +							  int duplex)
> > > > > > > +{
> > > > > > > +	struct property_entry port_props[2] = {};
> > > > > > > +	struct fwnode_handle *fixed_link_fwnode;
> > > > > > > +	struct fwnode_handle *new_port_fwnode;
> > > > > > > +
> > > > > > > +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > > > > > > +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > > > > > > +	if (IS_ERR(new_port_fwnode))
> > > > > > > +		return new_port_fwnode;
> > > > > > > +
> > > > > > > +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> > > > > > > +							  speed, duplex);
> > > > > > > +	if (IS_ERR(fixed_link_fwnode)) {
> > > > > > > +		fwnode_remove_software_node(new_port_fwnode);
> > > > > > > +		return fixed_link_fwnode;
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	return new_port_fwnode;
> > > > > > > +}
> > > > > > 
> > > > > > That new fwnode_create_named_software_node() function looks like a
> > > > > > conflict waiting to happen - if a driver adds a node to the root level
> > > > > > (does not have to be root level), all the tests will pass because
> > > > > > there is only a single device, but when a user later tries the driver
> > > > > > with two devices, it fails, because the node already exist. But you
> > > > > > don't need that function at all.
> > > > > 
> > > > > I think you're totally failing to explain how this can fail.
> > > > > 
> > > > > Let me reiterate what thestructure of the swnodes here is:
> > > > > 
> > > > > 	root
> > > > > 	`- node%d (%d allocated by root IDA)
> > > > > 	   +- phy-mode property
> > > > > 	   `- fixed-link
> > > > > 	      +- speed property
> > > > > 	      `- optional full-duplex property
> > > > > 
> > > > > If we have two different devices creating these nodes, then at the
> > > > > root level, they will end up having different root names. The
> > > > > "fixed-link" is a child of this node.
> > > > 
> > > > Ah, sorry, the problem is not with this patch, or your use case. The
> > > > problem is with the PATCH 1/7 of this series where you introduce that
> > > > new function fwnode_create_named_software_node() which will not be
> > > > tied to your use case only. In this patch you just use that function.
> > > > I should have been more clear on that.
> > > 
> > > How is this any different from creating two struct device's with the
> > > same parent and the same name? Or kobject_add() with the same parent
> > > and name?
> > 
> > But that can not mean we have to take the same risk everywhere. I do
> > understand that we don't protect developers from doing silly decisions
> > in kernel, but that does not mean that we should simply accept
> > interfaces into the kernel that expose these risk if we don't need
> > them.
> > 
> > > > I really just wanted to show how you can create those nodes by using
> > > > the API designed for the statically described software nodes. So you
> > > > don't need that new function. Please check that proposal from my
> > > > original reply.
> > > 
> > > I don't see why I should. This is clearly a case that if one creates
> > > two named nodes with the same name and same parent, it should fail and
> > > it's definitely a "well don't do that then" in just the same way that
> > > one doesn't do it with kobject_add() or any of the other numerous
> > > interfaces that take names in a space that need to be unique.
> > > 
> > > I really don't think there is any issue here to be solved. In fact,
> > > I think solving it will add additional useless complexity that just
> > > isn't required - which adds extra code that can be wrong and fail.
> > > 
> > > Let's keep this simple. This approach is simple. If one does something
> > > stupid (like creating two named nodes with the same name and same
> > > parent) then it will verbosely fail. That is a good thing.
> > 
> > Well, I think the most simplest approach would be to have both the
> > nodes and the properties as part of that struct mv88e6xxx_chip:
> > 
> > struct mv88e6xxx_chip {
> >         ...
> >        struct property_entry port_props[2];
> >        struct property_entry fixed_link_props[3];
> > 
> >        struct software_node port_swnode;
> >        struct software_node fixed_link_swnode;
> > };
> > 
> > That allows you to register both nodes in one go:
> > 
> > static struct fwnode_handle *mv88e6xxx_create_port_swnode(struct mv88e6xxx_chip *chip,
> >                                                           phy_interface_t mode,
> >                                                           int speed,
> >                                                           int duplex)
> > {
> >         struct software_node *nodes[3] = {
> >                 &chip->port_swnode,
> >                 &chip->fixed_link_swnode,
> >         };
> >         int ret;
> > 
> >         chip->port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> >         chip->port_swnode.properties = chip->port_props;
> > 
> >         chip->fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> >         if (duplex == DUPLEX_FULL)
> >                 chip->fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > 
> >         chip->fixed_link_swnode.name = "fixed-link";
> >         chip->fixed_link_swnode.parent = &chip->port_swnode;
> >         chip->fixed_link_swnode.properties = chip->fixed_link_props;
> > 
> >         ret = software_node_register_node_group(nodes);
> >         if (ret)
> >                 return ERR_PTR(ret);
> > 
> >         return software_node_fwnode(&chip->port_swnode);
> > }
> 
> You're suggesting code that passes a fwnode pointer back up layers
> that has been allocated in the driver's private structure, assuming
> that those upper layers are going to release this before re-calling
> this function for a different port. They do today, but in the future?
> 
> There are always plenty of guns...
> 
> If we don't want to give the monkey the gun, we need a more complex
> solution than that... and it becomes a question about how far you
> want to take gun control.
> 
> Then there's the question about why we should have this data allocated
> permanently in the system when it is only used for a very short period
> during driver initialisation. That seems to be a complete waste of
> resources.

Also, given that the data structures get re-used, your above example
code is actually buggy - so you seem to have taken the gun and shot
yourself! Why is it buggy?

If on the first call to it, duplex is DUPLEX_FULL, then we set
fixed_link_props[1] to point at the full-duplex property. On the next
call, if we pass DUPLEX_HALF, then fixed_link_props[1] is left#
untouched and will still point at the full-duplex property.

This is a great illustration why trying to remove one gun from
circulation results in other more subtle guns being manufactured.

I'm in favour of simple obviously correct code, which is what my
proposal is. If someone does something stupid with it such as
creating two swnodes with the same name, that isn't a problem -
kobject (and sysfs) will detect the error, print a warning and
return an error - resulting in a graceful cleanup. It won't crash
the kernel.

I think you're making a mountain out of a mole hill over the "someone
might use fwnode_create_named_software_node() to create two nodes with
the same name under the same parent" issue. At least to me, it's
really not an issue that should have been raised, and I am firmly
of the opinion that this is a total non-issue.

I'm also of the opinion that trying to "fix" this non-issue creates
more problems than it solves.
Heikki Krogerus March 28, 2023, 12:09 p.m. UTC | #11
On Mon, Mar 27, 2023 at 04:45:19PM +0100, Russell King (Oracle) wrote:
> On Mon, Mar 27, 2023 at 03:32:46PM +0100, Russell King (Oracle) wrote:
> > On Mon, Mar 27, 2023 at 05:13:25PM +0300, Heikki Krogerus wrote:
> > > Hi Russell,
> > > 
> > > On Mon, Mar 27, 2023 at 11:55:00AM +0100, Russell King (Oracle) wrote:
> > > > On Mon, Mar 27, 2023 at 01:28:06PM +0300, Heikki Krogerus wrote:
> > > > > On Fri, Mar 24, 2023 at 05:04:25PM +0000, Russell King (Oracle) wrote:
> > > > > > On Fri, Mar 24, 2023 at 04:49:32PM +0200, Heikki Krogerus wrote:
> > > > > > > Hi Russell,
> > > > > > > 
> > > > > > > On Wed, Mar 22, 2023 at 12:00:21PM +0000, Russell King (Oracle) wrote:
> > > > > > > > +static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
> > > > > > > > +							   int speed,
> > > > > > > > +							   int duplex)
> > > > > > > > +{
> > > > > > > > +	struct property_entry fixed_link_props[3] = { };
> > > > > > > > +
> > > > > > > > +	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > > > > > > > +	if (duplex == DUPLEX_FULL)
> > > > > > > > +		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > > > > > > > +
> > > > > > > > +	return fwnode_create_named_software_node(fixed_link_props, parent,
> > > > > > > > +						 "fixed-link");
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
> > > > > > > > +							  int speed,
> > > > > > > > +							  int duplex)
> > > > > > > > +{
> > > > > > > > +	struct property_entry port_props[2] = {};
> > > > > > > > +	struct fwnode_handle *fixed_link_fwnode;
> > > > > > > > +	struct fwnode_handle *new_port_fwnode;
> > > > > > > > +
> > > > > > > > +	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > > > > > > > +	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > > > > > > > +	if (IS_ERR(new_port_fwnode))
> > > > > > > > +		return new_port_fwnode;
> > > > > > > > +
> > > > > > > > +	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
> > > > > > > > +							  speed, duplex);
> > > > > > > > +	if (IS_ERR(fixed_link_fwnode)) {
> > > > > > > > +		fwnode_remove_software_node(new_port_fwnode);
> > > > > > > > +		return fixed_link_fwnode;
> > > > > > > > +	}
> > > > > > > > +
> > > > > > > > +	return new_port_fwnode;
> > > > > > > > +}
> > > > > > > 
> > > > > > > That new fwnode_create_named_software_node() function looks like a
> > > > > > > conflict waiting to happen - if a driver adds a node to the root level
> > > > > > > (does not have to be root level), all the tests will pass because
> > > > > > > there is only a single device, but when a user later tries the driver
> > > > > > > with two devices, it fails, because the node already exist. But you
> > > > > > > don't need that function at all.
> > > > > > 
> > > > > > I think you're totally failing to explain how this can fail.
> > > > > > 
> > > > > > Let me reiterate what thestructure of the swnodes here is:
> > > > > > 
> > > > > > 	root
> > > > > > 	`- node%d (%d allocated by root IDA)
> > > > > > 	   +- phy-mode property
> > > > > > 	   `- fixed-link
> > > > > > 	      +- speed property
> > > > > > 	      `- optional full-duplex property
> > > > > > 
> > > > > > If we have two different devices creating these nodes, then at the
> > > > > > root level, they will end up having different root names. The
> > > > > > "fixed-link" is a child of this node.
> > > > > 
> > > > > Ah, sorry, the problem is not with this patch, or your use case. The
> > > > > problem is with the PATCH 1/7 of this series where you introduce that
> > > > > new function fwnode_create_named_software_node() which will not be
> > > > > tied to your use case only. In this patch you just use that function.
> > > > > I should have been more clear on that.
> > > > 
> > > > How is this any different from creating two struct device's with the
> > > > same parent and the same name? Or kobject_add() with the same parent
> > > > and name?
> > > 
> > > But that can not mean we have to take the same risk everywhere. I do
> > > understand that we don't protect developers from doing silly decisions
> > > in kernel, but that does not mean that we should simply accept
> > > interfaces into the kernel that expose these risk if we don't need
> > > them.
> > > 
> > > > > I really just wanted to show how you can create those nodes by using
> > > > > the API designed for the statically described software nodes. So you
> > > > > don't need that new function. Please check that proposal from my
> > > > > original reply.
> > > > 
> > > > I don't see why I should. This is clearly a case that if one creates
> > > > two named nodes with the same name and same parent, it should fail and
> > > > it's definitely a "well don't do that then" in just the same way that
> > > > one doesn't do it with kobject_add() or any of the other numerous
> > > > interfaces that take names in a space that need to be unique.
> > > > 
> > > > I really don't think there is any issue here to be solved. In fact,
> > > > I think solving it will add additional useless complexity that just
> > > > isn't required - which adds extra code that can be wrong and fail.
> > > > 
> > > > Let's keep this simple. This approach is simple. If one does something
> > > > stupid (like creating two named nodes with the same name and same
> > > > parent) then it will verbosely fail. That is a good thing.
> > > 
> > > Well, I think the most simplest approach would be to have both the
> > > nodes and the properties as part of that struct mv88e6xxx_chip:
> > > 
> > > struct mv88e6xxx_chip {
> > >         ...
> > >        struct property_entry port_props[2];
> > >        struct property_entry fixed_link_props[3];
> > > 
> > >        struct software_node port_swnode;
> > >        struct software_node fixed_link_swnode;
> > > };
> > > 
> > > That allows you to register both nodes in one go:
> > > 
> > > static struct fwnode_handle *mv88e6xxx_create_port_swnode(struct mv88e6xxx_chip *chip,
> > >                                                           phy_interface_t mode,
> > >                                                           int speed,
> > >                                                           int duplex)
> > > {
> > >         struct software_node *nodes[3] = {
> > >                 &chip->port_swnode,
> > >                 &chip->fixed_link_swnode,
> > >         };
> > >         int ret;
> > > 
> > >         chip->port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
> > >         chip->port_swnode.properties = chip->port_props;
> > > 
> > >         chip->fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
> > >         if (duplex == DUPLEX_FULL)
> > >                 chip->fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
> > > 
> > >         chip->fixed_link_swnode.name = "fixed-link";
> > >         chip->fixed_link_swnode.parent = &chip->port_swnode;
> > >         chip->fixed_link_swnode.properties = chip->fixed_link_props;
> > > 
> > >         ret = software_node_register_node_group(nodes);
> > >         if (ret)
> > >                 return ERR_PTR(ret);
> > > 
> > >         return software_node_fwnode(&chip->port_swnode);
> > > }
> > 
> > You're suggesting code that passes a fwnode pointer back up layers
> > that has been allocated in the driver's private structure, assuming
> > that those upper layers are going to release this before re-calling
> > this function for a different port. They do today, but in the future?
> > 
> > There are always plenty of guns...
> > 
> > If we don't want to give the monkey the gun, we need a more complex
> > solution than that... and it becomes a question about how far you
> > want to take gun control.
> > 
> > Then there's the question about why we should have this data allocated
> > permanently in the system when it is only used for a very short period
> > during driver initialisation. That seems to be a complete waste of
> > resources.
> 
> Also, given that the data structures get re-used, your above example
> code is actually buggy - so you seem to have taken the gun and shot
> yourself! Why is it buggy?

> If on the first call to it, duplex is DUPLEX_FULL, then we set
> fixed_link_props[1] to point at the full-duplex property. On the next
> call, if we pass DUPLEX_HALF, then fixed_link_props[1] is left#
> untouched and will still point at the full-duplex property.

I have not shared a patch with you, I've only shared code snippet with
you - an example like you said. The only purpose of it is to give you
a rough idea how you can use the API. Please note that I did also
explain you that if you need to allocate those structures then you
just go ahead allocate them (see my original reply). I'm not giving
you the final solution, only the correct approach in a form of a
sketch.

The way you adapt that into this driver is not up to me, it's
something you need to do.

> This is a great illustration why trying to remove one gun from
> circulation results in other more subtle guns being manufactured.
> 
> I'm in favour of simple obviously correct code, which is what my
> proposal is. If someone does something stupid with it such as
> creating two swnodes with the same name, that isn't a problem -
> kobject (and sysfs) will detect the error, print a warning and
> return an error - resulting in a graceful cleanup. It won't crash
> the kernel.

The problem is that the function you are proposing will be exploited
silently - people will use NULL as the parent without anybody
noticing. Everything will work for a while, because everybody will
first only have a single device for that driver. But as time goes by
and new hardware appears, suddenly there are multiple devices for
those drivers, and the conflict start to appear.

At that point the changes that added the function call will have
trickled down to the stable trees, so the distros are affected. Now we
are no longer talking about a simple cleanup that fixes the issue. In
the unlikely, but possible case, this will turn into ABI problem if
there is something in user space that for whatever reason now expects
the node to be always accessible at the same level and with the same
name.

As you pointed out, this kind of risks we have to live with kbojects,
struct device stuff and many others, but the thing is, with the
software node and device property APIs right now we don't. So the fact
that a risk exists in one place just isn't justification to accept the
same risk absolutely everywhere.

> I think you're making a mountain out of a mole hill over the "someone
> might use fwnode_create_named_software_node() to create two nodes with
> the same name under the same parent" issue. At least to me, it's
> really not an issue that should have been raised, and I am firmly
> of the opinion that this is a total non-issue.
> 
> I'm also of the opinion that trying to "fix" this non-issue creates
> more problems than it solves.

Russell, if you have some good arguments for accepting your proposal,
I assure you I will agree with you, but so far all you have given are
attacks on a sketch details and statements like that "I think you're
making a mountain out of a mole". Those just are not good enough.

I've explained, repeatedly, the risk involved with your proposal. Your
counter arguments to that has been that the same risk exists
elsewhere, which does not matter like I explained above, and basically
that we can live with the risk, which really should not be acceptable
answer in general, but in this case since we have an alternative
approach it is definitely not an argument.

Your claim that using the existing API adds complexity is also
complete bogus, because the changes you will need to do to this patch
(this driver) are going to very small - the code will not look that
different. Just try the existing API.

So please note that I'm not trying to "fix" anything here - I don't
have to. Right now it seems all I'm doing is preventing useless
function from being added to the kernel.

thanks,
Russell King (Oracle) March 28, 2023, 1:23 p.m. UTC | #12
On Tue, Mar 28, 2023 at 03:09:56PM +0300, Heikki Krogerus wrote:
> The problem is that the function you are proposing will be exploited
> silently - people will use NULL as the parent without anybody
> noticing. Everything will work for a while, because everybody will
> first only have a single device for that driver. But as time goes by
> and new hardware appears, suddenly there are multiple devices for
> those drivers, and the conflict start to appear.

So, an easy solution would be to reject a call to
fwnode_create_named_software_node() when parent is NULL, thereby
preventing named nodes at the root level.

> At that point the changes that added the function call will have
> trickled down to the stable trees, so the distros are affected. Now we
> are no longer talking about a simple cleanup that fixes the issue. In
> the unlikely, but possible case, this will turn into ABI problem if

There is no such thing as stable APIs for internal kernel interfaces.

Documentation/process/stable-api-nonsense.rst

> As you pointed out, this kind of risks we have to live with kbojects,
> struct device stuff and many others, but the thing is, with the
> software node and device property APIs right now we don't. So the fact
> that a risk exists in one place just isn't justification to accept the
> same risk absolutely everywhere.

Meanwhile, firmware descriptions explicitly permit looking up nodes by
their names, but here we are, with the software node maintainers
basically stating that they don't wish to support creating software
nodes with explicit names.

> Russell, if you have some good arguments for accepting your proposal,
> I assure you I will agree with you, but so far all you have given are
> attacks on a sketch details and statements like that "I think you're
> making a mountain out of a mole". Those just are not good enough.

Basically, I think you are outright wrong for all the reasons I have
given in all my emails on this subject.

Yes, I accept there is a *slight* risk of abuse, but I see it as no
different from the risk from incorrect usage of any other kernel
internal interface. Therefore I just do not accept your argument
that we should not have this function, and I do not accept your
reasoning.
Heikki Krogerus March 29, 2023, 2:07 p.m. UTC | #13
On Tue, Mar 28, 2023 at 02:23:41PM +0100, Russell King (Oracle) wrote:
> On Tue, Mar 28, 2023 at 03:09:56PM +0300, Heikki Krogerus wrote:
> > The problem is that the function you are proposing will be exploited
> > silently - people will use NULL as the parent without anybody
> > noticing. Everything will work for a while, because everybody will
> > first only have a single device for that driver. But as time goes by
> > and new hardware appears, suddenly there are multiple devices for
> > those drivers, and the conflict start to appear.
> 
> So, an easy solution would be to reject a call to
> fwnode_create_named_software_node() when parent is NULL, thereby
> preventing named nodes at the root level.
> 
> > At that point the changes that added the function call will have
> > trickled down to the stable trees, so the distros are affected. Now we
> > are no longer talking about a simple cleanup that fixes the issue. In
> > the unlikely, but possible case, this will turn into ABI problem if
> 
> There is no such thing as stable APIs for internal kernel interfaces.
> 
> Documentation/process/stable-api-nonsense.rst
> 
> > As you pointed out, this kind of risks we have to live with kbojects,
> > struct device stuff and many others, but the thing is, with the
> > software node and device property APIs right now we don't. So the fact
> > that a risk exists in one place just isn't justification to accept the
> > same risk absolutely everywhere.
> 
> Meanwhile, firmware descriptions explicitly permit looking up nodes by
> their names, but here we are, with the software node maintainers
> basically stating that they don't wish to support creating software
> nodes with explicit names.

If you want to name the nodes then you just go ahead and name them,
nobody is preventing you and you can already do that, but if you do
so, then you will take full responsibility of the entire software node
- that is what you are naming here - instead of just the fwnode that
it contains. The users of the node can deal with the fwnode alone, but
you as the creator of the software node have to take proper ownership
of it.

> > Russell, if you have some good arguments for accepting your proposal,
> > I assure you I will agree with you, but so far all you have given are
> > attacks on a sketch details and statements like that "I think you're
> > making a mountain out of a mole". Those just are not good enough.
> 
> Basically, I think you are outright wrong for all the reasons I have
> given in all my emails on this subject.
> 
> Yes, I accept there is a *slight* risk of abuse, but I see it as no
> different from the risk from incorrect usage of any other kernel
> internal interface. Therefore I just do not accept your argument
> that we should not have this function, and I do not accept your
> reasoning.

I would not be so against the function if there wasn't any other way
to handle your case, but there is.

You really can not claim that the existing API is in any way inferior,
or even more complex, compared to your function before you actually
try it. You simply can not make judgement based on a sketch that is
basically just showing you the functions and structures that you need.

If there are issues with the API, then we need to of course fix those
issues, but please keep in mind that still does not mean we have any
need for the function you are proposing.

Please also note that helpers are welcome if you feel we need them. If
you want to add for example an allocation routine that duplicates also
the properties in one go, then that alone would reduce the complexity
needed in the drivers that create the nodes. I think in most cases,
possibly also in yours, that alone would allow most stuff to be
handled from stack memory.

fwnode_create_software_node() is there just to support the legacy
device properties. You really should not be using even that. If you
need to deal with software nodes then you deal with them with struct
software_node.

thanks,
Russell King (Oracle) March 29, 2023, 2:33 p.m. UTC | #14
On Wed, Mar 29, 2023 at 05:07:26PM +0300, Heikki Krogerus wrote:
> On Tue, Mar 28, 2023 at 02:23:41PM +0100, Russell King (Oracle) wrote:
> > On Tue, Mar 28, 2023 at 03:09:56PM +0300, Heikki Krogerus wrote:
> > > The problem is that the function you are proposing will be exploited
> > > silently - people will use NULL as the parent without anybody
> > > noticing. Everything will work for a while, because everybody will
> > > first only have a single device for that driver. But as time goes by
> > > and new hardware appears, suddenly there are multiple devices for
> > > those drivers, and the conflict start to appear.
> > 
> > So, an easy solution would be to reject a call to
> > fwnode_create_named_software_node() when parent is NULL, thereby
> > preventing named nodes at the root level.
> > 
> > > At that point the changes that added the function call will have
> > > trickled down to the stable trees, so the distros are affected. Now we
> > > are no longer talking about a simple cleanup that fixes the issue. In
> > > the unlikely, but possible case, this will turn into ABI problem if
> > 
> > There is no such thing as stable APIs for internal kernel interfaces.
> > 
> > Documentation/process/stable-api-nonsense.rst
> > 
> > > As you pointed out, this kind of risks we have to live with kbojects,
> > > struct device stuff and many others, but the thing is, with the
> > > software node and device property APIs right now we don't. So the fact
> > > that a risk exists in one place just isn't justification to accept the
> > > same risk absolutely everywhere.
> > 
> > Meanwhile, firmware descriptions explicitly permit looking up nodes by
> > their names, but here we are, with the software node maintainers
> > basically stating that they don't wish to support creating software
> > nodes with explicit names.
> 
> If you want to name the nodes then you just go ahead and name them,
> nobody is preventing you and you can already do that, but if you do
> so, then you will take full responsibility of the entire software node
> - that is what you are naming here - instead of just the fwnode that
> it contains. The users of the node can deal with the fwnode alone, but
> you as the creator of the software node have to take proper ownership
> of it.
> 
> > > Russell, if you have some good arguments for accepting your proposal,
> > > I assure you I will agree with you, but so far all you have given are
> > > attacks on a sketch details and statements like that "I think you're
> > > making a mountain out of a mole". Those just are not good enough.
> > 
> > Basically, I think you are outright wrong for all the reasons I have
> > given in all my emails on this subject.
> > 
> > Yes, I accept there is a *slight* risk of abuse, but I see it as no
> > different from the risk from incorrect usage of any other kernel
> > internal interface. Therefore I just do not accept your argument
> > that we should not have this function, and I do not accept your
> > reasoning.
> 
> I would not be so against the function if there wasn't any other way
> to handle your case, but there is.
> 
> You really can not claim that the existing API is in any way inferior,
> or even more complex, compared to your function before you actually
> try it. You simply can not make judgement based on a sketch that is
> basically just showing you the functions and structures that you need.
> 
> If there are issues with the API, then we need to of course fix those
> issues, but please keep in mind that still does not mean we have any
> need for the function you are proposing.
> 
> Please also note that helpers are welcome if you feel we need them. If
> you want to add for example an allocation routine that duplicates also
> the properties in one go, then that alone would reduce the complexity
> needed in the drivers that create the nodes. I think in most cases,
> possibly also in yours, that alone would allow most stuff to be
> handled from stack memory.
> 
> fwnode_create_software_node() is there just to support the legacy
> device properties. You really should not be using even that. If you
> need to deal with software nodes then you deal with them with struct
> software_node.

You forgot to explain how to free them once they're done, because
struct swnode will contain a pointer to the struct software_node
which can be a dangling stale reference - and there's no way for
code outside swnode.c to know when that reference has gone.

That is another reason why I prefer my existing solution. That
problem is taken care of already by the existing code - and as
it's taken care of there, and properly, there's less possibilities
for users of swnode to get it wrong.
Heikki Krogerus March 30, 2023, 1:54 p.m. UTC | #15
On Wed, Mar 29, 2023 at 03:33:48PM +0100, Russell King (Oracle) wrote:
> On Wed, Mar 29, 2023 at 05:07:26PM +0300, Heikki Krogerus wrote:
> > On Tue, Mar 28, 2023 at 02:23:41PM +0100, Russell King (Oracle) wrote:
> > > On Tue, Mar 28, 2023 at 03:09:56PM +0300, Heikki Krogerus wrote:
> > > > The problem is that the function you are proposing will be exploited
> > > > silently - people will use NULL as the parent without anybody
> > > > noticing. Everything will work for a while, because everybody will
> > > > first only have a single device for that driver. But as time goes by
> > > > and new hardware appears, suddenly there are multiple devices for
> > > > those drivers, and the conflict start to appear.
> > > 
> > > So, an easy solution would be to reject a call to
> > > fwnode_create_named_software_node() when parent is NULL, thereby
> > > preventing named nodes at the root level.
> > > 
> > > > At that point the changes that added the function call will have
> > > > trickled down to the stable trees, so the distros are affected. Now we
> > > > are no longer talking about a simple cleanup that fixes the issue. In
> > > > the unlikely, but possible case, this will turn into ABI problem if
> > > 
> > > There is no such thing as stable APIs for internal kernel interfaces.
> > > 
> > > Documentation/process/stable-api-nonsense.rst
> > > 
> > > > As you pointed out, this kind of risks we have to live with kbojects,
> > > > struct device stuff and many others, but the thing is, with the
> > > > software node and device property APIs right now we don't. So the fact
> > > > that a risk exists in one place just isn't justification to accept the
> > > > same risk absolutely everywhere.
> > > 
> > > Meanwhile, firmware descriptions explicitly permit looking up nodes by
> > > their names, but here we are, with the software node maintainers
> > > basically stating that they don't wish to support creating software
> > > nodes with explicit names.
> > 
> > If you want to name the nodes then you just go ahead and name them,
> > nobody is preventing you and you can already do that, but if you do
> > so, then you will take full responsibility of the entire software node
> > - that is what you are naming here - instead of just the fwnode that
> > it contains. The users of the node can deal with the fwnode alone, but
> > you as the creator of the software node have to take proper ownership
> > of it.
> > 
> > > > Russell, if you have some good arguments for accepting your proposal,
> > > > I assure you I will agree with you, but so far all you have given are
> > > > attacks on a sketch details and statements like that "I think you're
> > > > making a mountain out of a mole". Those just are not good enough.
> > > 
> > > Basically, I think you are outright wrong for all the reasons I have
> > > given in all my emails on this subject.
> > > 
> > > Yes, I accept there is a *slight* risk of abuse, but I see it as no
> > > different from the risk from incorrect usage of any other kernel
> > > internal interface. Therefore I just do not accept your argument
> > > that we should not have this function, and I do not accept your
> > > reasoning.
> > 
> > I would not be so against the function if there wasn't any other way
> > to handle your case, but there is.
> > 
> > You really can not claim that the existing API is in any way inferior,
> > or even more complex, compared to your function before you actually
> > try it. You simply can not make judgement based on a sketch that is
> > basically just showing you the functions and structures that you need.
> > 
> > If there are issues with the API, then we need to of course fix those
> > issues, but please keep in mind that still does not mean we have any
> > need for the function you are proposing.
> > 
> > Please also note that helpers are welcome if you feel we need them. If
> > you want to add for example an allocation routine that duplicates also
> > the properties in one go, then that alone would reduce the complexity
> > needed in the drivers that create the nodes. I think in most cases,
> > possibly also in yours, that alone would allow most stuff to be
> > handled from stack memory.
> > 
> > fwnode_create_software_node() is there just to support the legacy
> > device properties. You really should not be using even that. If you
> > need to deal with software nodes then you deal with them with struct
> > software_node.
> 
> You forgot to explain how to free them once they're done, because
> struct swnode will contain a pointer to the struct software_node
> which can be a dangling stale reference - and there's no way for
> code outside swnode.c to know when that reference has gone.
> 
> That is another reason why I prefer my existing solution. That
> problem is taken care of already by the existing code - and as
> it's taken care of there, and properly, there's less possibilities
> for users of swnode to get it wrong.

We need an improved release mechanism, yes.

My idea with the new dynamic allocation routine was that it could be
introduced together with a release callback that we add to the struct
software_node.

The idea of adding the release callback to the structure was actually
considered already some time ago - I think it was discussed at least
shortly also on the public ACPI mailing list. The idea back then
included a default release function that simply frees the struct
software_node instance. That default release function we could then
assign to the release callback in that new software node
allocation/creation routine. That way the drivers should be able to
continue to rely on the underlying code to take care of freeing the
node instance.

Back then there was nobody who really needed that functionality, so
nobody even tried to implement it. Now we of course clearly do need
something like it.

I think the release callback together with the default release
function should work. Let me know what you guys think.

thanks,
Russell King (Oracle) April 3, 2023, 1:02 p.m. UTC | #16
On Thu, Mar 30, 2023 at 04:54:51PM +0300, Heikki Krogerus wrote:
> On Wed, Mar 29, 2023 at 03:33:48PM +0100, Russell King (Oracle) wrote:
> > On Wed, Mar 29, 2023 at 05:07:26PM +0300, Heikki Krogerus wrote:
> > > On Tue, Mar 28, 2023 at 02:23:41PM +0100, Russell King (Oracle) wrote:
> > > > On Tue, Mar 28, 2023 at 03:09:56PM +0300, Heikki Krogerus wrote:
> > > > > The problem is that the function you are proposing will be exploited
> > > > > silently - people will use NULL as the parent without anybody
> > > > > noticing. Everything will work for a while, because everybody will
> > > > > first only have a single device for that driver. But as time goes by
> > > > > and new hardware appears, suddenly there are multiple devices for
> > > > > those drivers, and the conflict start to appear.
> > > > 
> > > > So, an easy solution would be to reject a call to
> > > > fwnode_create_named_software_node() when parent is NULL, thereby
> > > > preventing named nodes at the root level.
> > > > 
> > > > > At that point the changes that added the function call will have
> > > > > trickled down to the stable trees, so the distros are affected. Now we
> > > > > are no longer talking about a simple cleanup that fixes the issue. In
> > > > > the unlikely, but possible case, this will turn into ABI problem if
> > > > 
> > > > There is no such thing as stable APIs for internal kernel interfaces.
> > > > 
> > > > Documentation/process/stable-api-nonsense.rst
> > > > 
> > > > > As you pointed out, this kind of risks we have to live with kbojects,
> > > > > struct device stuff and many others, but the thing is, with the
> > > > > software node and device property APIs right now we don't. So the fact
> > > > > that a risk exists in one place just isn't justification to accept the
> > > > > same risk absolutely everywhere.
> > > > 
> > > > Meanwhile, firmware descriptions explicitly permit looking up nodes by
> > > > their names, but here we are, with the software node maintainers
> > > > basically stating that they don't wish to support creating software
> > > > nodes with explicit names.
> > > 
> > > If you want to name the nodes then you just go ahead and name them,
> > > nobody is preventing you and you can already do that, but if you do
> > > so, then you will take full responsibility of the entire software node
> > > - that is what you are naming here - instead of just the fwnode that
> > > it contains. The users of the node can deal with the fwnode alone, but
> > > you as the creator of the software node have to take proper ownership
> > > of it.
> > > 
> > > > > Russell, if you have some good arguments for accepting your proposal,
> > > > > I assure you I will agree with you, but so far all you have given are
> > > > > attacks on a sketch details and statements like that "I think you're
> > > > > making a mountain out of a mole". Those just are not good enough.
> > > > 
> > > > Basically, I think you are outright wrong for all the reasons I have
> > > > given in all my emails on this subject.
> > > > 
> > > > Yes, I accept there is a *slight* risk of abuse, but I see it as no
> > > > different from the risk from incorrect usage of any other kernel
> > > > internal interface. Therefore I just do not accept your argument
> > > > that we should not have this function, and I do not accept your
> > > > reasoning.
> > > 
> > > I would not be so against the function if there wasn't any other way
> > > to handle your case, but there is.
> > > 
> > > You really can not claim that the existing API is in any way inferior,
> > > or even more complex, compared to your function before you actually
> > > try it. You simply can not make judgement based on a sketch that is
> > > basically just showing you the functions and structures that you need.
> > > 
> > > If there are issues with the API, then we need to of course fix those
> > > issues, but please keep in mind that still does not mean we have any
> > > need for the function you are proposing.
> > > 
> > > Please also note that helpers are welcome if you feel we need them. If
> > > you want to add for example an allocation routine that duplicates also
> > > the properties in one go, then that alone would reduce the complexity
> > > needed in the drivers that create the nodes. I think in most cases,
> > > possibly also in yours, that alone would allow most stuff to be
> > > handled from stack memory.
> > > 
> > > fwnode_create_software_node() is there just to support the legacy
> > > device properties. You really should not be using even that. If you
> > > need to deal with software nodes then you deal with them with struct
> > > software_node.
> > 
> > You forgot to explain how to free them once they're done, because
> > struct swnode will contain a pointer to the struct software_node
> > which can be a dangling stale reference - and there's no way for
> > code outside swnode.c to know when that reference has gone.
> > 
> > That is another reason why I prefer my existing solution. That
> > problem is taken care of already by the existing code - and as
> > it's taken care of there, and properly, there's less possibilities
> > for users of swnode to get it wrong.
> 
> We need an improved release mechanism, yes.
> 
> My idea with the new dynamic allocation routine was that it could be
> introduced together with a release callback that we add to the struct
> software_node.
> 
> The idea of adding the release callback to the structure was actually
> considered already some time ago - I think it was discussed at least
> shortly also on the public ACPI mailing list. The idea back then
> included a default release function that simply frees the struct
> software_node instance. That default release function we could then
> assign to the release callback in that new software node
> allocation/creation routine. That way the drivers should be able to
> continue to rely on the underlying code to take care of freeing the
> node instance.
> 
> Back then there was nobody who really needed that functionality, so
> nobody even tried to implement it. Now we of course clearly do need
> something like it.
> 
> I think the release callback together with the default release
> function should work. Let me know what you guys think.

Thinking about this more, no. This is utterly vile, and I will not
create code that is vile.

Greg, please can you take a look at this, and give your opinion on
how named software nodes (which are required to describe things in
specific ways by firmware descriptions) should be handled? Is my
proposal reasonable in your eyes? Thanks.
Greg KH April 5, 2023, 5:51 p.m. UTC | #17
On Mon, Apr 03, 2023 at 02:02:48PM +0100, Russell King (Oracle) wrote:
> On Thu, Mar 30, 2023 at 04:54:51PM +0300, Heikki Krogerus wrote:
> > On Wed, Mar 29, 2023 at 03:33:48PM +0100, Russell King (Oracle) wrote:
> > > On Wed, Mar 29, 2023 at 05:07:26PM +0300, Heikki Krogerus wrote:
> > > > On Tue, Mar 28, 2023 at 02:23:41PM +0100, Russell King (Oracle) wrote:
> > > > > On Tue, Mar 28, 2023 at 03:09:56PM +0300, Heikki Krogerus wrote:
> > > > > > The problem is that the function you are proposing will be exploited
> > > > > > silently - people will use NULL as the parent without anybody
> > > > > > noticing. Everything will work for a while, because everybody will
> > > > > > first only have a single device for that driver. But as time goes by
> > > > > > and new hardware appears, suddenly there are multiple devices for
> > > > > > those drivers, and the conflict start to appear.
> > > > > 
> > > > > So, an easy solution would be to reject a call to
> > > > > fwnode_create_named_software_node() when parent is NULL, thereby
> > > > > preventing named nodes at the root level.
> > > > > 
> > > > > > At that point the changes that added the function call will have
> > > > > > trickled down to the stable trees, so the distros are affected. Now we
> > > > > > are no longer talking about a simple cleanup that fixes the issue. In
> > > > > > the unlikely, but possible case, this will turn into ABI problem if
> > > > > 
> > > > > There is no such thing as stable APIs for internal kernel interfaces.
> > > > > 
> > > > > Documentation/process/stable-api-nonsense.rst
> > > > > 
> > > > > > As you pointed out, this kind of risks we have to live with kbojects,
> > > > > > struct device stuff and many others, but the thing is, with the
> > > > > > software node and device property APIs right now we don't. So the fact
> > > > > > that a risk exists in one place just isn't justification to accept the
> > > > > > same risk absolutely everywhere.
> > > > > 
> > > > > Meanwhile, firmware descriptions explicitly permit looking up nodes by
> > > > > their names, but here we are, with the software node maintainers
> > > > > basically stating that they don't wish to support creating software
> > > > > nodes with explicit names.
> > > > 
> > > > If you want to name the nodes then you just go ahead and name them,
> > > > nobody is preventing you and you can already do that, but if you do
> > > > so, then you will take full responsibility of the entire software node
> > > > - that is what you are naming here - instead of just the fwnode that
> > > > it contains. The users of the node can deal with the fwnode alone, but
> > > > you as the creator of the software node have to take proper ownership
> > > > of it.
> > > > 
> > > > > > Russell, if you have some good arguments for accepting your proposal,
> > > > > > I assure you I will agree with you, but so far all you have given are
> > > > > > attacks on a sketch details and statements like that "I think you're
> > > > > > making a mountain out of a mole". Those just are not good enough.
> > > > > 
> > > > > Basically, I think you are outright wrong for all the reasons I have
> > > > > given in all my emails on this subject.
> > > > > 
> > > > > Yes, I accept there is a *slight* risk of abuse, but I see it as no
> > > > > different from the risk from incorrect usage of any other kernel
> > > > > internal interface. Therefore I just do not accept your argument
> > > > > that we should not have this function, and I do not accept your
> > > > > reasoning.
> > > > 
> > > > I would not be so against the function if there wasn't any other way
> > > > to handle your case, but there is.
> > > > 
> > > > You really can not claim that the existing API is in any way inferior,
> > > > or even more complex, compared to your function before you actually
> > > > try it. You simply can not make judgement based on a sketch that is
> > > > basically just showing you the functions and structures that you need.
> > > > 
> > > > If there are issues with the API, then we need to of course fix those
> > > > issues, but please keep in mind that still does not mean we have any
> > > > need for the function you are proposing.
> > > > 
> > > > Please also note that helpers are welcome if you feel we need them. If
> > > > you want to add for example an allocation routine that duplicates also
> > > > the properties in one go, then that alone would reduce the complexity
> > > > needed in the drivers that create the nodes. I think in most cases,
> > > > possibly also in yours, that alone would allow most stuff to be
> > > > handled from stack memory.
> > > > 
> > > > fwnode_create_software_node() is there just to support the legacy
> > > > device properties. You really should not be using even that. If you
> > > > need to deal with software nodes then you deal with them with struct
> > > > software_node.
> > > 
> > > You forgot to explain how to free them once they're done, because
> > > struct swnode will contain a pointer to the struct software_node
> > > which can be a dangling stale reference - and there's no way for
> > > code outside swnode.c to know when that reference has gone.
> > > 
> > > That is another reason why I prefer my existing solution. That
> > > problem is taken care of already by the existing code - and as
> > > it's taken care of there, and properly, there's less possibilities
> > > for users of swnode to get it wrong.
> > 
> > We need an improved release mechanism, yes.
> > 
> > My idea with the new dynamic allocation routine was that it could be
> > introduced together with a release callback that we add to the struct
> > software_node.
> > 
> > The idea of adding the release callback to the structure was actually
> > considered already some time ago - I think it was discussed at least
> > shortly also on the public ACPI mailing list. The idea back then
> > included a default release function that simply frees the struct
> > software_node instance. That default release function we could then
> > assign to the release callback in that new software node
> > allocation/creation routine. That way the drivers should be able to
> > continue to rely on the underlying code to take care of freeing the
> > node instance.
> > 
> > Back then there was nobody who really needed that functionality, so
> > nobody even tried to implement it. Now we of course clearly do need
> > something like it.
> > 
> > I think the release callback together with the default release
> > function should work. Let me know what you guys think.
> 
> Thinking about this more, no. This is utterly vile, and I will not
> create code that is vile.
> 
> Greg, please can you take a look at this, and give your opinion on
> how named software nodes (which are required to describe things in
> specific ways by firmware descriptions) should be handled? Is my
> proposal reasonable in your eyes? Thanks.

I'm lost, sorry, this thread is crazy long and I do not have the
bandwidth to try to dig through it.  I think you two need to work it out
together please.

thanks,

greg k-h

> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
diff mbox series

Patch

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index b73d1d6747b7..dde0c306fb3a 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -26,6 +26,7 @@ 
 #include <linux/of_device.h>
 #include <linux/of_irq.h>
 #include <linux/of_mdio.h>
+#include <linux/of_net.h>
 #include <linux/platform_data/mv88e6xxx.h>
 #include <linux/netdevice.h>
 #include <linux/gpio/consumer.h>
@@ -841,6 +842,113 @@  static void mv88e6xxx_get_caps(struct dsa_switch *ds, int port,
 	}
 }
 
+static struct fwnode_handle *mv88e6xxx_create_fixed_swnode(struct fwnode_handle *parent,
+							   int speed,
+							   int duplex)
+{
+	struct property_entry fixed_link_props[3] = { };
+
+	fixed_link_props[0] = PROPERTY_ENTRY_U32("speed", speed);
+	if (duplex == DUPLEX_FULL)
+		fixed_link_props[1] = PROPERTY_ENTRY_BOOL("full-duplex");
+
+	return fwnode_create_named_software_node(fixed_link_props, parent,
+						 "fixed-link");
+}
+
+static struct fwnode_handle *mv88e6xxx_create_port_swnode(phy_interface_t mode,
+							  int speed,
+							  int duplex)
+{
+	struct property_entry port_props[2] = {};
+	struct fwnode_handle *fixed_link_fwnode;
+	struct fwnode_handle *new_port_fwnode;
+
+	port_props[0] = PROPERTY_ENTRY_STRING("phy-mode", phy_modes(mode));
+	new_port_fwnode = fwnode_create_software_node(port_props, NULL);
+	if (IS_ERR(new_port_fwnode))
+		return new_port_fwnode;
+
+	fixed_link_fwnode = mv88e6xxx_create_fixed_swnode(new_port_fwnode,
+							  speed, duplex);
+	if (IS_ERR(fixed_link_fwnode)) {
+		fwnode_remove_software_node(new_port_fwnode);
+		return fixed_link_fwnode;
+	}
+
+	return new_port_fwnode;
+}
+
+static unsigned long mv88e6xxx_get_max_caps(struct phylink_config *config,
+					    phy_interface_t *mode)
+{
+	unsigned long max_caps, caps, mac_caps;
+	phy_interface_t interface;
+
+	mac_caps = config->mac_capabilities;
+	if (*mode != PHY_INTERFACE_MODE_NA)
+		return phylink_get_capabilities(*mode, mac_caps,
+						RATE_MATCH_NONE);
+
+	max_caps = 0;
+	for_each_set_bit(interface, config->supported_interfaces,
+			 PHY_INTERFACE_MODE_MAX) {
+		caps = phylink_get_capabilities(interface, mac_caps,
+						RATE_MATCH_NONE);
+		if (caps > max_caps) {
+			max_caps = caps;
+			*mode = interface;
+		}
+	}
+
+	return max_caps;
+}
+
+static struct fwnode_handle *mv88e6xxx_port_get_fwnode(struct dsa_switch *ds,
+						       int port,
+						       struct fwnode_handle *h)
+{
+	struct mv88e6xxx_chip *chip = ds->priv;
+	struct device_node *np, *phy_node;
+	int speed, duplex, err;
+	phy_interface_t mode;
+	struct dsa_port *dp;
+	unsigned long caps;
+
+	dp = dsa_to_port(ds, port);
+	if (dsa_port_is_user(dp))
+		return h;
+
+	/* No DT? Eh? */
+	np = to_of_node(h);
+	if (!np)
+		return h;
+
+	/* If a PHY, fixed-link specification, or in-band mode, then we
+	 * don't need to do any fixup. Simply return the provided fwnode.
+	 */
+	phy_node = of_parse_phandle(np, "phy-handle", 0);
+	of_node_put(phy_node);
+	if (phy_node || of_phy_is_fixed_link(np))
+		return h;
+
+	/* Get the DT specified phy-mode. If missing, try the chip's max speed
+	 * mode method if implemented, otherwise try the supported_interfaces
+	 * mask for the interface mode that gives the fastest speeds.
+	 */
+	err = of_get_phy_mode(np, &mode);
+	if (err && chip->info->ops->port_max_speed_mode)
+		mode = chip->info->ops->port_max_speed_mode(port);
+
+	caps = mv88e6xxx_get_max_caps(&dp->pl_config, &mode);
+
+	err = phylink_find_max_speed(caps, &speed, &duplex);
+	if (err)
+		return ERR_PTR(err);
+
+	return mv88e6xxx_create_port_swnode(mode, speed, duplex);
+}
+
 static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port,
 				 unsigned int mode,
 				 const struct phylink_link_state *state)
@@ -6994,6 +7102,7 @@  static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
 	.teardown		= mv88e6xxx_teardown,
 	.port_setup		= mv88e6xxx_port_setup,
 	.port_teardown		= mv88e6xxx_port_teardown,
+	.port_get_fwnode	= mv88e6xxx_port_get_fwnode,
 	.phylink_get_caps	= mv88e6xxx_get_caps,
 	.phylink_mac_link_state	= mv88e6xxx_serdes_pcs_get_state,
 	.phylink_mac_config	= mv88e6xxx_mac_config,