Message ID | 1587659482-27133-2-git-send-email-phil.yang@arm.com |
---|---|
State | New |
Headers | show |
Series | [v2,1/6] service: fix race condition for MT unsafe service | expand |
> -----Original Message----- > From: Phil Yang <phil.yang@arm.com> > Sent: Thursday, April 23, 2020 5:31 PM > To: Van Haaren, Harry <harry.van.haaren@intel.com>; dev@dpdk.org > Cc: thomas@monjalon.net; david.marchand@redhat.com; Ananyev, Konstantin > <konstantin.ananyev@intel.com>; jerinj@marvell.com; > hemant.agrawal@nxp.com; Honnappa.Nagarahalli@arm.com; > gavin.hu@arm.com; nd@arm.com; Honnappa Nagarahalli > <honnappa.nagarahalli@arm.com>; stable@dpdk.org > Subject: [PATCH v2 1/6] service: fix race condition for MT unsafe service > > From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > The MT unsafe service might get configured to run on another core > while the service is running currently. This might result in the > MT unsafe service running on multiple cores simultaneously. Use > 'execute_lock' always when the service is MT unsafe. > > Fixes: e9139a32f6e8 ("service: add function to run on app lcore") > Cc: stable@dpdk.org > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > Reviewed-by: Phil Yang <phil.yang@arm.com> > --- Thanks for spinning a new revision - based on ML discussion previously, it seems like the "use service-run-count" to avoid this race would be a complex solution. Suggesting the following; 1) Take the approach as per this patch, to always take the atomic, fixing the race condition. 2) Add an API to service-cores, which allows "committing" of mappings. Committing the mapping would imply that the mappings will not be changed in future. With runtime-remapping being removed from the equation, the existing branch-over-atomic optimization is valid again. So this would offer applications two situations A) No application change: possible performance regression due to atomic always taken. B) Call "commit" API, and regain the performance as per previous DPDK versions. Thoughts/opinions on the above? I've flagged the rest of the patchset for review ASAP. Regards, -Harry > lib/librte_eal/common/rte_service.c | 11 +++++------ > 1 file changed, 5 insertions(+), 6 deletions(-) > > diff --git a/lib/librte_eal/common/rte_service.c > b/lib/librte_eal/common/rte_service.c > index 70d17a5..b8c465e 100644 > --- a/lib/librte_eal/common/rte_service.c > +++ b/lib/librte_eal/common/rte_service.c > @@ -50,6 +50,10 @@ struct rte_service_spec_impl { > uint8_t internal_flags; > > /* per service statistics */ > + /* Indicates how many cores the service is mapped to run on. > + * It does not indicate the number of cores the service is running > + * on currently. > + */ > rte_atomic32_t num_mapped_cores; > uint64_t calls; > uint64_t cycles_spent; > @@ -370,12 +374,7 @@ service_run(uint32_t i, struct core_state *cs, uint64_t > service_mask, > > cs->service_active_on_lcore[i] = 1; > > - /* check do we need cmpset, if MT safe or <= 1 core > - * mapped, atomic ops are not required. > - */ > - const int use_atomics = (service_mt_safe(s) == 0) && > - (rte_atomic32_read(&s->num_mapped_cores) > 1); > - if (use_atomics) { > + if (service_mt_safe(s) == 0) { > if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1)) > return -EBUSY; > > -- > 2.7.4
Hi Harry, Thanks for getting back on this. <snip> > > Subject: [PATCH v2 1/6] service: fix race condition for MT unsafe > > service > > > > From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > > > The MT unsafe service might get configured to run on another core > > while the service is running currently. This might result in the MT > > unsafe service running on multiple cores simultaneously. Use > > 'execute_lock' always when the service is MT unsafe. > > > > Fixes: e9139a32f6e8 ("service: add function to run on app lcore") > > Cc: stable@dpdk.org > > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > Reviewed-by: Phil Yang <phil.yang@arm.com> > > --- > > Thanks for spinning a new revision - based on ML discussion previously, it > seems like the "use service-run-count" to avoid this race would be a complex > solution. > > Suggesting the following; > 1) Take the approach as per this patch, to always take the atomic, fixing the > race condition. Ok > 2) Add an API to service-cores, which allows "committing" of mappings. > Committing the mapping would imply that the mappings will not be changed > in future. With runtime-remapping being removed from the equation, the > existing branch-over-atomic optimization is valid again. Ok. Just to make sure I understand this: a) on the data plane, if commit API is called (probably a new state variable) and num_mapped_cores is set to 1, there is no need to take the lock. b) possible implementation of the commit API would check if num_mapped_cores for the service is set to 1 and set a variable to indicate that the lock is not required. What do you think about asking the application to set the service capability to MT_SAFE if it knows that the service will run on a single core? This would require us to change the documentation and does not require additional code. > > So this would offer applications two situations > A) No application change: possible performance regression due to atomic > always taken. > B) Call "commit" API, and regain the performance as per previous DPDK > versions. > > Thoughts/opinions on the above? I've flagged the rest of the patchset for > review ASAP. Regards, -Harry > > > lib/librte_eal/common/rte_service.c | 11 +++++------ > > 1 file changed, 5 insertions(+), 6 deletions(-) > > > > diff --git a/lib/librte_eal/common/rte_service.c > > b/lib/librte_eal/common/rte_service.c > > index 70d17a5..b8c465e 100644 > > --- a/lib/librte_eal/common/rte_service.c > > +++ b/lib/librte_eal/common/rte_service.c > > @@ -50,6 +50,10 @@ struct rte_service_spec_impl { > > uint8_t internal_flags; > > > > /* per service statistics */ > > + /* Indicates how many cores the service is mapped to run on. > > + * It does not indicate the number of cores the service is running > > + * on currently. > > + */ > > rte_atomic32_t num_mapped_cores; > > uint64_t calls; > > uint64_t cycles_spent; > > @@ -370,12 +374,7 @@ service_run(uint32_t i, struct core_state *cs, > > uint64_t service_mask, > > > > cs->service_active_on_lcore[i] = 1; > > > > - /* check do we need cmpset, if MT safe or <= 1 core > > - * mapped, atomic ops are not required. > > - */ > > - const int use_atomics = (service_mt_safe(s) == 0) && > > - (rte_atomic32_read(&s- > >num_mapped_cores) > 1); > > - if (use_atomics) { > > + if (service_mt_safe(s) == 0) { > > if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1)) > > return -EBUSY; > > > > -- > > 2.7.4
> -----Original Message----- > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> > Sent: Wednesday, April 29, 2020 11:49 PM > To: Van Haaren, Harry <harry.van.haaren@intel.com>; Phil Yang > <Phil.Yang@arm.com>; dev@dpdk.org > Cc: thomas@monjalon.net; david.marchand@redhat.com; Ananyev, Konstantin > <konstantin.ananyev@intel.com>; jerinj@marvell.com; > hemant.agrawal@nxp.com; Gavin Hu <Gavin.Hu@arm.com>; nd > <nd@arm.com>; stable@dpdk.org; Eads, Gage <gage.eads@intel.com>; > Richardson, Bruce <bruce.richardson@intel.com>; Honnappa Nagarahalli > <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com> > Subject: RE: [PATCH v2 1/6] service: fix race condition for MT unsafe service > > Hi Harry, > Thanks for getting back on this. > > <snip> > > > > Subject: [PATCH v2 1/6] service: fix race condition for MT unsafe > > > service > > > > > > From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > > > > > The MT unsafe service might get configured to run on another core > > > while the service is running currently. This might result in the MT > > > unsafe service running on multiple cores simultaneously. Use > > > 'execute_lock' always when the service is MT unsafe. > > > > > > Fixes: e9139a32f6e8 ("service: add function to run on app lcore") > > > Cc: stable@dpdk.org > > > > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > > Reviewed-by: Phil Yang <phil.yang@arm.com> > > > --- > > > > Thanks for spinning a new revision - based on ML discussion previously, it > > seems like the "use service-run-count" to avoid this race would be a complex > > solution. > > > > Suggesting the following; > > 1) Take the approach as per this patch, to always take the atomic, fixing the > > race condition. > Ok I've micro-benchmarked this code change inside the service cores autotest, and it introduces around 35 cycles of overhead per service call. This is not ideal, but given it's a bugfix, and by far the simplest method to fix this race-condition. Having discussed and investigated multiple other solutions, I believe this is the right solution. Thanks Honnappa and Phil for identifying and driving a solution. I suggest to post the benchmarking unit-test addition patch, and integrate that *before* the series under review here gets merged? This makes benchmarking of the "before bugfix" performance in future easier should it be required. > > 2) Add an API to service-cores, which allows "committing" of mappings. > > Committing the mapping would imply that the mappings will not be changed > > in future. With runtime-remapping being removed from the equation, the > > existing branch-over-atomic optimization is valid again. > Ok. Just to make sure I understand this: > a) on the data plane, if commit API is called (probably a new state variable) and > num_mapped_cores is set to 1, there is no need to take the lock. > b) possible implementation of the commit API would check if > num_mapped_cores for the service is set to 1 and set a variable to indicate that > the lock is not required. > > What do you think about asking the application to set the service capability to > MT_SAFE if it knows that the service will run on a single core? This would > require us to change the documentation and does not require additional code. That's a nice idea - I like that if applications want to micro-optimize around the atomic, that they have a workaround/solution to do so, particularly that it doesn't require code-changes and backporting. Will send review and send feedback on the patches themselves. Regards, -Harry > > So this would offer applications two situations > > A) No application change: possible performance regression due to atomic > > always taken. > > B) Call "commit" API, and regain the performance as per previous DPDK > > versions. > > > > Thoughts/opinions on the above? I've flagged the rest of the patchset for > > review ASAP. Regards, -Harry > > > > > lib/librte_eal/common/rte_service.c | 11 +++++------ > > > 1 file changed, 5 insertions(+), 6 deletions(-) <snip patch changes>
<snip> > > > > > > Subject: [PATCH v2 1/6] service: fix race condition for MT unsafe > > > > service > > > > > > > > From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > > > > > > > The MT unsafe service might get configured to run on another core > > > > while the service is running currently. This might result in the > > > > MT unsafe service running on multiple cores simultaneously. Use > > > > 'execute_lock' always when the service is MT unsafe. > > > > > > > > Fixes: e9139a32f6e8 ("service: add function to run on app lcore") > > > > Cc: stable@dpdk.org > > > > > > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > > > Reviewed-by: Phil Yang <phil.yang@arm.com> > > > > --- > > > > > > Thanks for spinning a new revision - based on ML discussion > > > previously, it seems like the "use service-run-count" to avoid this > > > race would be a complex solution. > > > > > > Suggesting the following; > > > 1) Take the approach as per this patch, to always take the atomic, > > > fixing the race condition. > > Ok > > I've micro-benchmarked this code change inside the service cores autotest, > and it introduces around 35 cycles of overhead per service call. This is not > ideal, but given it's a bugfix, and by far the simplest method to fix this race- > condition. Having discussed and investigated multiple other solutions, I > believe this is the right solution. > Thanks Honnappa and Phil for identifying and driving a solution. You are welcome. Thank you for your timely responses. > > I suggest to post the benchmarking unit-test addition patch, and integrate > that > *before* the series under review here gets merged? This makes > benchmarking of the "before bugfix" performance in future easier should it be > required. I do not see any issues, would be happy to review. I think we still have time to catch up with RC2 (May 8th). You had also mentioned about calling out that, the control plane APIs are not MT safe. Should I add that to this patch? > > > > > 2) Add an API to service-cores, which allows "committing" of mappings. > > > Committing the mapping would imply that the mappings will not be > > > changed in future. With runtime-remapping being removed from the > > > equation, the existing branch-over-atomic optimization is valid again. > > Ok. Just to make sure I understand this: > > a) on the data plane, if commit API is called (probably a new state > > variable) and num_mapped_cores is set to 1, there is no need to take the > lock. > > b) possible implementation of the commit API would check if > > num_mapped_cores for the service is set to 1 and set a variable to > > indicate that the lock is not required. > > > > What do you think about asking the application to set the service > > capability to MT_SAFE if it knows that the service will run on a > > single core? This would require us to change the documentation and does > not require additional code. > > That's a nice idea - I like that if applications want to micro-optimize around > the atomic, that they have a workaround/solution to do so, particularly that it > doesn't require code-changes and backporting. > > Will send review and send feedback on the patches themselves. > Regards, -Harry > > > > So this would offer applications two situations > > > A) No application change: possible performance regression due to > > > atomic always taken. > > > B) Call "commit" API, and regain the performance as per previous > > > DPDK versions. > > > > > > Thoughts/opinions on the above? I've flagged the rest of the > > > patchset for review ASAP. Regards, -Harry > > > > > > > lib/librte_eal/common/rte_service.c | 11 +++++------ > > > > 1 file changed, 5 insertions(+), 6 deletions(-) > <snip patch changes>
> -----Original Message----- > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> > Sent: Friday, May 1, 2020 3:56 PM > To: Van Haaren, Harry <harry.van.haaren@intel.com>; Phil Yang > <Phil.Yang@arm.com>; dev@dpdk.org > Cc: thomas@monjalon.net; david.marchand@redhat.com; Ananyev, Konstantin > <konstantin.ananyev@intel.com>; jerinj@marvell.com; > hemant.agrawal@nxp.com; Gavin Hu <Gavin.Hu@arm.com>; nd > <nd@arm.com>; stable@dpdk.org; Eads, Gage <gage.eads@intel.com>; > Richardson, Bruce <bruce.richardson@intel.com>; nd <nd@arm.com>; > Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com> > Subject: RE: [PATCH v2 1/6] service: fix race condition for MT unsafe service > > <snip> > > > > > > > > Subject: [PATCH v2 1/6] service: fix race condition for MT unsafe > > > > > service > > > > > > > > > > From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > > > > > > > > > The MT unsafe service might get configured to run on another core > > > > > while the service is running currently. This might result in the > > > > > MT unsafe service running on multiple cores simultaneously. Use > > > > > 'execute_lock' always when the service is MT unsafe. > > > > > > > > > > Fixes: e9139a32f6e8 ("service: add function to run on app lcore") > > > > > Cc: stable@dpdk.org > > > > > > > > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > > > > > Reviewed-by: Phil Yang <phil.yang@arm.com> > > > > > --- > > > > > > > > Thanks for spinning a new revision - based on ML discussion > > > > previously, it seems like the "use service-run-count" to avoid this > > > > race would be a complex solution. > > > > > > > > Suggesting the following; > > > > 1) Take the approach as per this patch, to always take the atomic, > > > > fixing the race condition. > > > Ok > > > > I've micro-benchmarked this code change inside the service cores autotest, > > and it introduces around 35 cycles of overhead per service call. This is not > > ideal, but given it's a bugfix, and by far the simplest method to fix this race- > > condition. Having discussed and investigated multiple other solutions, I > > believe this is the right solution. > > Thanks Honnappa and Phil for identifying and driving a solution. > You are welcome. Thank you for your timely responses. Perhaps not so timely after all ... I'll review C11 patches Tuesday morning, it's a long weekend in Ireland! > > I suggest to post the benchmarking unit-test addition patch, and integrate > > that > > *before* the series under review here gets merged? This makes > > benchmarking of the "before bugfix" performance in future easier should it be > > required. > I do not see any issues, would be happy to review. Thanks for volunteering, you're on CC when sent, for convenience: http://patches.dpdk.org/patch/69651/ > I think we still have time to catch up with RC2 (May 8th). Agree, merge into RC2 would be great. > You had also mentioned about calling out that, the control plane APIs are not > MT safe. Should I add that to this patch? Yes, that'd be great. <snip discussion details> Cheers, -Harry
diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c index 70d17a5..b8c465e 100644 --- a/lib/librte_eal/common/rte_service.c +++ b/lib/librte_eal/common/rte_service.c @@ -50,6 +50,10 @@ struct rte_service_spec_impl { uint8_t internal_flags; /* per service statistics */ + /* Indicates how many cores the service is mapped to run on. + * It does not indicate the number of cores the service is running + * on currently. + */ rte_atomic32_t num_mapped_cores; uint64_t calls; uint64_t cycles_spent; @@ -370,12 +374,7 @@ service_run(uint32_t i, struct core_state *cs, uint64_t service_mask, cs->service_active_on_lcore[i] = 1; - /* check do we need cmpset, if MT safe or <= 1 core - * mapped, atomic ops are not required. - */ - const int use_atomics = (service_mt_safe(s) == 0) && - (rte_atomic32_read(&s->num_mapped_cores) > 1); - if (use_atomics) { + if (service_mt_safe(s) == 0) { if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1)) return -EBUSY;