Message ID | 3a54a6703879d10f08cf0275a2a69297ebd2b1d4.1637592133.git.geert+renesas@glider.be |
---|---|
State | New |
Headers | show |
Series | Non-const bitfield helper conversions | expand |
On 11/22/21 10:32 AM, Johannes Berg wrote: > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: >> The existing FIELD_{GET,PREP}() macros are limited to compile-time >> constants. However, it is very common to prepare or extract bitfield >> elements where the bitfield mask is not a compile-time constant. >> > > I'm not sure it's really a good idea to add a third API here? > > We have the upper-case (constant) versions, and already > {u32,...}_get_bits()/etc. I've used these a lot (and personally prefer the lower-case ones). Your new macros don't do anything to ensure the field mask is of the right form, which is basically: (2 ^ width - 1) << shift I really like the property that the field mask must be constant. That being said, I've had to use some strange coding patterns in order to adhere to the "const only" rule in a few cases. So if you can come up with a satisfactory naming scheme I'm all for it. -Alex > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > architectures (afaict), so that seems a bit awkward. > > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > if it is indeed a constant? The __field_overflow() usage is already only > done if __builtin_constant_p(v), so I guess we can do the same with > __bad_mask()? > > johannes >
Hi Johannes, On Mon, Nov 22, 2021 at 5:33 PM Johannes Berg <johannes@sipsolutions.net> wrote: > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: > > The existing FIELD_{GET,PREP}() macros are limited to compile-time > > constants. However, it is very common to prepare or extract bitfield > > elements where the bitfield mask is not a compile-time constant. > > > > I'm not sure it's really a good idea to add a third API here? > > We have the upper-case (constant) versions, and already > {u32,...}_get_bits()/etc. These don't work for non-const masks. > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > architectures (afaict), so that seems a bit awkward. That's a valid comment. Can be fixed by using a wrapper macro that checks if typeof(mask) == u64, and uses an __ffs64() version when needed. > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > if it is indeed a constant? The __field_overflow() usage is already only > done if __builtin_constant_p(v), so I guess we can do the same with > __bad_mask()? Are all compilers smart enough to replace the division by field_multiplier(field) by a shift? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Alex, On Tue, Nov 23, 2021 at 2:52 AM Alex Elder <elder@ieee.org> wrote: > On 11/22/21 10:32 AM, Johannes Berg wrote: > > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: > >> The existing FIELD_{GET,PREP}() macros are limited to compile-time > >> constants. However, it is very common to prepare or extract bitfield > >> elements where the bitfield mask is not a compile-time constant. > > > > I'm not sure it's really a good idea to add a third API here? > > > > We have the upper-case (constant) versions, and already > > {u32,...}_get_bits()/etc. > > I've used these a lot (and personally prefer the lower-case ones). > > Your new macros don't do anything to ensure the field mask is > of the right form, which is basically: (2 ^ width - 1) << shift > I really like the property that the field mask must be constant. That's correct. How to enforce that in the non-const case? BUG()/WARN() is not an option ;-) > That being said, I've had to use some strange coding patterns > in order to adhere to the "const only" rule in a few cases. > So if you can come up with a satisfactory naming scheme I'm > all for it. There are plenty of drivers that handle masks stored in a data structure, so it would be good if they can use a suitable helper, as open-coding is prone to errors. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Tue, 2021-11-23 at 09:36 +0100, Geert Uytterhoeven wrote: Ah, here's your comment wrt. which one is nicer :) > > > We have the upper-case (constant) versions, and already > > > {u32,...}_get_bits()/etc. > > TBH, I don't like the *_get_bits() API: in general, u32_get_bits() does > the same as FIELD_GET(), but the order of the parameters is different? I don't really see how "the order of parameters is different" is a downside? Yeah it means if you're used to FIELD_GET() then you'll retrain, but ...? > (*_replace_bits() seems to be useful, though) Indeed. Also as I said in my other mail, the le32/be32/... variants are tremendously useful, and they fundamentally cannot be expressed with the FIELD_GET() or field_get() macros. IMHO this is a clear advantage to the typed versions, and if you ask me we should get rid of the FIELD_GETand FIELD_PREP entirely - difficult now, but at least let's not propagate that? johannes
On Tue, 23 Nov 2021 09:36:22 +0100 Geert Uytterhoeven wrote: > > > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > > > architectures (afaict), so that seems a bit awkward. > > > > > > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > > > if it is indeed a constant? The __field_overflow() usage is already only > > > done if __builtin_constant_p(v), so I guess we can do the same with > > > __bad_mask()? > > > > Either that or add decomposition macros. Are compilers still really bad > > at passing small structs by value? > > Sorry, I don't get what you mean by adding decomposition macros. > Can you please elaborate? #define DECOMPOSE(_mask) \ (struct bf){ .mask = _mask, .shf = __bf_shf(_mask), } Then drivers can save or pass around the mask and shift params broken apart as a small struct.
Geert Uytterhoeven <geert@linux-m68k.org> writes: > Hi Johannes, > > On Mon, Nov 22, 2021 at 5:33 PM Johannes Berg <johannes@sipsolutions.net> wrote: >> On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: >> > The existing FIELD_{GET,PREP}() macros are limited to compile-time >> > constants. However, it is very common to prepare or extract bitfield >> > elements where the bitfield mask is not a compile-time constant. >> > >> >> I'm not sure it's really a good idea to add a third API here? >> >> We have the upper-case (constant) versions, and already >> {u32,...}_get_bits()/etc. > > These don't work for non-const masks. > >> Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit >> architectures (afaict), so that seems a bit awkward. > > That's a valid comment. Can be fixed by using a wrapper macro > that checks if typeof(mask) == u64, and uses an __ffs64() version when > needed. > >> Maybe we can make {u32,...}_get_bits() be doing compile-time only checks >> if it is indeed a constant? The __field_overflow() usage is already only >> done if __builtin_constant_p(v), so I guess we can do the same with >> __bad_mask()? > > Are all compilers smart enough to replace the division by > field_multiplier(field) by a shift? It looks like the answer is no as few weeks back I received a comment internally that a team is seeing a slow down with u32_get_bits(): "Time taken for executing both the macros/inline function (in terms of microseconds) (out of 3 Trails) FIELD_GET : 32, 31, 32 u32_get_bits : 6379, 6664, 6558" Sadly I didn't realise to ask what compiler they were using. But I still prefer {u32,...}_get_bits() over FIELD_GET(), they are just so much cleaner to use.
On Wed, 24 Nov 2021 09:03:24 +0100 Johannes Berg wrote: > On Tue, 2021-11-23 at 15:49 -0800, Jakub Kicinski wrote: > > > Indeed. > > > > > > Also as I said in my other mail, the le32/be32/... variants are > > > tremendously useful, and they fundamentally cannot be expressed with the > > > FIELD_GET() or field_get() macros. IMHO this is a clear advantage to the > > > > Can you elaborate? > > Well, the way I see it, the only advantage of FIELD_GET() is that it > will auto-determine the type (based on the mask type.) This cannot work > if you need be/le conversions, because the be/le type annotations are > invisible to the compiler. > > So obviously you could write a BE32_FIELD_GET(), but then really that's > equivalent to be32_get_bits() - note you you have to actually specify > the type in the macro name. I guess in theory you could make macros > where the type is an argument (like FIELD_GET_TYPE(be32, ...)), but I > don't see how that gains anything. Ah, that's what you meant! Thanks for spelling it out. FWIW I never found the be/le versions useful. Most of the time the data comes from bus accessors which swap or is unaligned so you have to do be/le_get_unaligned, which swaps. Plus if you access/set multiple fields you'd swap them one by one which seems wasteful. > > > typed versions, and if you ask me we should get rid of the FIELD_GETand > > > FIELD_PREP entirely - difficult now, but at least let's not propagate > > > that? > > > > I don't see why. > > Just for being more regular, in the spirit of "there's exactly one > correct way of doing it" :) Right now it seems the uppercase macros are more prevalent. Could just be because of the way the "swapping ones" are defined.
diff --git a/drivers/clk/at91/clk-peripheral.c b/drivers/clk/at91/clk-peripheral.c index e14fa5ac734cead7..e2f33498139a1b8c 100644 --- a/drivers/clk/at91/clk-peripheral.c +++ b/drivers/clk/at91/clk-peripheral.c @@ -3,6 +3,7 @@ * Copyright (C) 2013 Boris BREZILLON <b.brezillon@overkiz.com> */ +#include <linux/bitfield.h> #include <linux/bitops.h> #include <linux/clk-provider.h> #include <linux/clkdev.h> diff --git a/drivers/clk/at91/pmc.h b/drivers/clk/at91/pmc.h index 3a1bf6194c287d09..1256e1ab91526a25 100644 --- a/drivers/clk/at91/pmc.h +++ b/drivers/clk/at91/pmc.h @@ -114,9 +114,6 @@ struct at91_clk_pms { unsigned int parent; }; -#define field_get(_mask, _reg) (((_reg) & (_mask)) >> (ffs(_mask) - 1)) -#define field_prep(_mask, _val) (((_val) << (ffs(_mask) - 1)) & (_mask)) - #define ndck(a, s) (a[s - 1].id + 1) #define nck(a) (a[ARRAY_SIZE(a) - 1].id + 1) struct pmc_data *pmc_data_allocate(unsigned int ncore, unsigned int nsystem, diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h index 4e035aca6f7e6000..f03b0712e4babec1 100644 --- a/include/linux/bitfield.h +++ b/include/linux/bitfield.h @@ -156,4 +156,34 @@ __MAKE_OP(64) #undef __MAKE_OP #undef ____MAKE_OP +/** + * field_prep() - prepare a bitfield element + * @_mask: shifted mask defining the field's length and position + * @_val: value to put in the field + * + * field_prep() masks and shifts up the value. The result should be + * combined with other fields of the bitfield using logical OR. + * Unlike FIELD_PREP(), @_mask is not limited to a compile-time constant. + */ +#define field_prep(_mask, _val) \ + ({ \ + typeof(_mask) ___mask = (_mask); \ + (((_val) << __ffs(___mask)) & (___mask)); \ + }) + +/** + * field_get() - extract a bitfield element + * @_mask: shifted mask defining the field's length and position + * @_reg: value of entire bitfield + * + * field_get() extracts the field specified by @_mask from the + * bitfield passed in as @_reg by masking and shifting it down. + * Unlike FIELD_GET(), @_mask is not limited to a compile-time constant. + */ +#define field_get(_mask, _reg) \ + ({ \ + typeof(_mask) ___mask = _mask; \ + (((_reg) & (___mask)) >> __ffs(___mask)); \ + }) + #endif
The existing FIELD_{GET,PREP}() macros are limited to compile-time constants. However, it is very common to prepare or extract bitfield elements where the bitfield mask is not a compile-time constant. To avoid this limitation, the AT91 clock driver already has its own field_{prep,get}() macros. Make them available for general use by moving them to <linux/bitfield.h>, and improve them slightly: 1. Avoid evaluating macro parameters more than once, 2. Replace "ffs() - 1" by "__ffs()", as the latter operates on "unsigned long", just like BIT() and GENMASK(). Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> --- Using __ffs() actually reduces code size (16 bytes for each of drivers/clk/at91/clk-{generated,peripheral}.o), as __ffs() doesn't need to verify that the value passed is non-zero. --- drivers/clk/at91/clk-peripheral.c | 1 + drivers/clk/at91/pmc.h | 3 --- include/linux/bitfield.h | 30 ++++++++++++++++++++++++++++++ 3 files changed, 31 insertions(+), 3 deletions(-)