| Message ID | 20181210211231.4084251-1-arnd@arndb.de |
| --- | --- |
| State | Accepted |
| Commit | 852b2876a8a82f2c618da40b881bb07c2d42eb5c |
| Series | staging: vchiq: rework remove_event handling |
On Mon, Dec 10, 2018 at 10:11:58PM +0100, Arnd Bergmann wrote:
> Note that I'm changing it back to 'killable' after the previous patch
> changed 'killable' to 'interruptible', apparently based on a misunderstanding
> of the subtle down_interruptible() macro override in vchiq_killable.h.

Oh wow... The non standard down_interruptible() macro is ugly.

regards,
dan carpenter
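For context on the complaint above: vchiq_killable.h (its contents are not shown in this thread) locally overrides core locking primitives with macros. The snippet below is only a sketch of that general pattern, with the override reduced to a hypothetical one-line macro; it is not the actual contents of the header.

```c
/*
 * Sketch only -- not the real vchiq_killable.h.  It illustrates how a
 * driver-local header can shadow a core kernel API with a macro, so a
 * call that reads as "interruptible" at the call site actually waits in
 * a killable fashion (woken only by fatal signals such as SIGKILL).
 */
#include <linux/semaphore.h>

/* hypothetical override, modelled on the pattern being criticised */
#define down_interruptible(sem) down_killable(sem)

static int example_wait(struct semaphore *sem)
{
	/* Looks interruptible, but with the macro above only a fatal
	 * signal ends the wait early, not e.g. a caught SIGINT. */
	return down_interruptible(sem);
}
```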
On Tue, 2018-12-11 at 13:07 +0300, Dan Carpenter wrote:
> On Mon, Dec 10, 2018 at 10:11:58PM +0100, Arnd Bergmann wrote:
> > Note that I'm changing it back to 'killable' after the previous patch
> > changed 'killable' to 'interruptible', apparently based on a misunderstanding
> > of the subtle down_interruptible() macro override in vchiq_killable.h.
>
> Oh wow... The non standard down_interruptible() macro is ugly.

Same reaction here, *_*. I wasn't even aware of it.

Arnd, as long as you're not doing it already, I'll prepare some extra
fixes for the rest of the completions and get rid of that file
altogether.

Regards,
Nicolas
Hi Arnd, thanks for the patch!

On Mon, 2018-12-10 at 22:11 +0100, Arnd Bergmann wrote:
> I had started the removal of semaphores in this driver without knowing
> that Nicolas Saenz Julienne also worked on this. In case of the "remote
> event" infrastructure, my solution seemed significantly better, so I'm
> proposing this as a change on top.
>
> The problem with using either semaphores or completions here is that
> it's an overly complex way of waking up a thread, and it looks like the
> 'count' of the semaphore can easily get out of sync, even though I found
> it hard to come up with a specific example.
>
> Changing it to a 'wait_queue_head_t' instead of a completion simplifies
> this by letting us wait directly on the 'event->fired' variable that is
> set by the videocore.
>
> Another simplification is passing the wait queue directly into the helper
> functions instead of going through the fragile logic of recording the
> offset inside of a structure as part of a shared memory variable. This
> also avoids one uncached memory read and should be faster.
>
> Note that I'm changing it back to 'killable' after the previous patch
> changed 'killable' to 'interruptible', apparently based on a misunderstanding
> of the subtle down_interruptible() macro override in vchiq_killable.h.
>
> Fixes: f27e47bc6b8b ("staging: vchiq: use completions instead of semaphores")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  .../interface/vchiq_arm/vchiq_core.c | 63 +++++++------------
>  .../interface/vchiq_arm/vchiq_core.h | 12 ++--
>  2 files changed, 30 insertions(+), 45 deletions(-)
>
> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
> index 482b5daf6c0c..eda3004a0c6a 100644
> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
> @@ -417,26 +417,23 @@ vchiq_set_conn_state(VCHIQ_STATE_T *state, VCHIQ_CONNSTATE_T newstate)
>  }
>
>  static inline void
> -remote_event_create(REMOTE_EVENT_T *event)
> +remote_event_create(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
>  {
>          event->armed = 0;
>          /* Don't clear the 'fired' flag because it may already have been set
>          ** by the other side. */
> +        init_waitqueue_head(wq);
>  }
>
>  static inline int
> -remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
> +remote_event_wait(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
>  {
>          if (!event->fired) {
>                  event->armed = 1;
>                  dsb(sy);
> -                if (!event->fired) {
> -                        if (wait_for_completion_interruptible(
> -                                        (struct completion *)
> -                                        ((char *)state + event->event))) {
> -                                event->armed = 0;
> -                                return 0;
> -                        }
> +                if (wait_event_killable(*wq, event->fired)) {
> +                        event->armed = 0;
> +                        return 0;
>                  }
>                  event->armed = 0;
>                  wmb();
> @@ -447,26 +444,26 @@ remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
>  }
>
>  static inline void
> -remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
> +remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
>  {
>          event->armed = 0;
> -        complete((struct completion *)((char *)state + event->event));
> +        wake_up_all(wq);

Shouldn't this just be "wake_up(wq)"?

Regards,
Nicolas
On Tue, Dec 11, 2018 at 1:31 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
> On Tue, 2018-12-11 at 13:07 +0300, Dan Carpenter wrote:
> > On Mon, Dec 10, 2018 at 10:11:58PM +0100, Arnd Bergmann wrote:
> > > Note that I'm changing it back to 'killable' after the previous patch
> > > changed 'killable' to 'interruptible', apparently based on a misunderstanding
> > > of the subtle down_interruptible() macro override in vchiq_killable.h.
> >
> > Oh wow... The non standard down_interruptible() macro is ugly.
>
> Same reaction here, *_*. I wasn't even aware of it.
>
> Arnd, as long as you're not doing it already, I'll prepare some
> extra fixes for the rest of the completions and get rid of that file
> altogether.

Ok, please do that, and add me to Cc. I think there is one counting
semaphore left (g_free_fragments_sema), which could just use the plain
down_killable() before you remove the wrapper.

I actually have another (longer) patch series, not entirely ready yet,
that converts all counting semaphores into a new 'struct counting_semaphore'
to split them away from the semaphores that can be converted into mutexes,
completions or other replacements.

       Arnd
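A minimal sketch of the suggestion above, assuming the stock semaphore API: once the wrapper header is gone, the remaining counting semaphore (g_free_fragments_sema, named in the mail) can call down_killable() directly. The helper function and its name are hypothetical, not the actual driver code.

```c
#include <linux/errno.h>
#include <linux/semaphore.h>

static struct semaphore g_free_fragments_sema;	/* the remaining counting semaphore */

/* Hypothetical helper: block until a fragment is released. */
static int wait_for_free_fragment(void)
{
	/* Plain down_killable(): only a fatal signal interrupts the wait,
	 * matching the 'killable' semantics discussed in this thread. */
	if (down_killable(&g_free_fragments_sema))
		return -EINTR;

	/* ...a fragment can now be taken off the free list... */
	return 0;
}
```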
On Tue, Dec 11, 2018 at 1:36 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
> On Mon, 2018-12-10 at 22:11 +0100, Arnd Bergmann wrote:
> > @@ -447,26 +444,26 @@ remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
> >  }
> >
> >  static inline void
> > -remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
> > +remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
> >  {
> >          event->armed = 0;
> > -        complete((struct completion *)((char *)state + event->event));
> > +        wake_up_all(wq);
>
> Shouldn't this just be "wake_up(wq)"?

I wasn't entirely sure whether we could end up with more than one thread
waiting for the wakeup. With the semaphore or completion that would
already be broken because we'd only wake up one of them, but I was hoping
to stay on the safe side with wake_up_all().

       Arnd
On Tue, 2018-12-11 at 15:20 +0100, Arnd Bergmann wrote:
> On Tue, Dec 11, 2018 at 1:36 PM Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> > On Mon, 2018-12-10 at 22:11 +0100, Arnd Bergmann wrote:
> > > @@ -447,26 +444,26 @@ remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
> > >  }
> > >
> > >  static inline void
> > > -remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
> > > +remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
> > >  {
> > >          event->armed = 0;
> > > -        complete((struct completion *)((char *)state + event->event));
> > > +        wake_up_all(wq);
> >
> > Shouldn't this just be "wake_up(wq)"?
>
> I wasn't entirely sure whether we could end up with more than one thread
> waiting for the wakeup. With the semaphore or completion that would
> already be broken because we'd only wake up one of them, but I was hoping
> to stay on the safe side with wake_up_all().

You're right. Had a look at the code and there shouldn't be more than one
thread waiting. wake_up_all() looks OK.

Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>

Regards,
Nicolas
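To summarise the pattern settled on above in a self-contained form: the waiter sleeps killably on a local wait queue until a flag set by the other side becomes true, and the signalling side wakes every waiter. This is a simplified sketch with made-up names (my_event), omitting the memory barriers and the 'armed' handshake of the real driver.

```c
#include <linux/wait.h>

struct my_event {
	int fired;		/* set by the remote side */
	wait_queue_head_t wq;	/* local waiters */
};

static void my_event_init(struct my_event *ev)
{
	ev->fired = 0;
	init_waitqueue_head(&ev->wq);
}

static int my_event_wait(struct my_event *ev)
{
	/* 0 once ev->fired is observed, -ERESTARTSYS on a fatal signal. */
	return wait_event_killable(ev->wq, ev->fired);
}

static void my_event_signal(struct my_event *ev)
{
	ev->fired = 1;
	wake_up_all(&ev->wq);	/* wake every waiter, as agreed above */
}
```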
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
index 482b5daf6c0c..eda3004a0c6a 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
@@ -417,26 +417,23 @@ vchiq_set_conn_state(VCHIQ_STATE_T *state, VCHIQ_CONNSTATE_T newstate)
 }
 
 static inline void
-remote_event_create(REMOTE_EVENT_T *event)
+remote_event_create(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
         event->armed = 0;
         /* Don't clear the 'fired' flag because it may already have been set
         ** by the other side. */
+        init_waitqueue_head(wq);
 }
 
 static inline int
-remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
+remote_event_wait(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
         if (!event->fired) {
                 event->armed = 1;
                 dsb(sy);
-                if (!event->fired) {
-                        if (wait_for_completion_interruptible(
-                                        (struct completion *)
-                                        ((char *)state + event->event))) {
-                                event->armed = 0;
-                                return 0;
-                        }
+                if (wait_event_killable(*wq, event->fired)) {
+                        event->armed = 0;
+                        return 0;
                 }
                 event->armed = 0;
                 wmb();
@@ -447,26 +444,26 @@ remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
 }
 
 static inline void
-remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
+remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
         event->armed = 0;
-        complete((struct completion *)((char *)state + event->event));
+        wake_up_all(wq);
 }
 
 static inline void
-remote_event_poll(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
+remote_event_poll(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
         if (event->fired && event->armed)
-                remote_event_signal_local(state, event);
+                remote_event_signal_local(wq, event);
 }
 
 void
 remote_event_pollall(VCHIQ_STATE_T *state)
 {
-        remote_event_poll(state, &state->local->sync_trigger);
-        remote_event_poll(state, &state->local->sync_release);
-        remote_event_poll(state, &state->local->trigger);
-        remote_event_poll(state, &state->local->recycle);
+        remote_event_poll(&state->sync_trigger_event, &state->local->sync_trigger);
+        remote_event_poll(&state->sync_release_event, &state->local->sync_release);
+        remote_event_poll(&state->trigger_event, &state->local->trigger);
+        remote_event_poll(&state->recycle_event, &state->local->recycle);
 }
 
 /* Round up message sizes so that any space at the end of a slot is always big
@@ -550,7 +547,7 @@ request_poll(VCHIQ_STATE_T *state, VCHIQ_SERVICE_T *service, int poll_type)
         wmb();
 
         /* ... and ensure the slot handler runs. */
-        remote_event_signal_local(state, &state->local->trigger);
+        remote_event_signal_local(&state->trigger_event, &state->local->trigger);
 }
 
 /* Called from queue_message, by the slot handler and application threads,
@@ -1069,7 +1066,7 @@ queue_message_sync(VCHIQ_STATE_T *state, VCHIQ_SERVICE_T *service,
             (mutex_lock_killable(&state->sync_mutex) != 0))
                 return VCHIQ_RETRY;
 
-        remote_event_wait(state, &local->sync_release);
+        remote_event_wait(&state->sync_release_event, &local->sync_release);
 
         rmb();
 
@@ -1887,7 +1884,7 @@ slot_handler_func(void *v)
         while (1) {
                 DEBUG_COUNT(SLOT_HANDLER_COUNT);
                 DEBUG_TRACE(SLOT_HANDLER_LINE);
-                remote_event_wait(state, &local->trigger);
+                remote_event_wait(&state->trigger_event, &local->trigger);
 
                 rmb();
 
@@ -1976,7 +1973,7 @@ recycle_func(void *v)
                 return -ENOMEM;
 
         while (1) {
-                remote_event_wait(state, &local->recycle);
+                remote_event_wait(&state->recycle_event, &local->recycle);
 
                 process_free_queue(state, found, length);
         }
@@ -1998,7 +1995,7 @@ sync_func(void *v)
                 int type;
                 unsigned int localport, remoteport;
 
-                remote_event_wait(state, &local->sync_trigger);
+                remote_event_wait(&state->sync_trigger_event, &local->sync_trigger);
 
                 rmb();
 
@@ -2193,11 +2190,6 @@ vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero)
         init_completion(&state->connect);
         mutex_init(&state->mutex);
 
-        init_completion(&state->trigger_event);
-        init_completion(&state->recycle_event);
-        init_completion(&state->sync_trigger_event);
-        init_completion(&state->sync_release_event);
-
         mutex_init(&state->slot_mutex);
         mutex_init(&state->recycle_mutex);
         mutex_init(&state->sync_mutex);
@@ -2229,24 +2221,17 @@ vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero)
         state->data_use_count = 0;
         state->data_quota = state->slot_queue_available - 1;
 
-        local->trigger.event = offsetof(VCHIQ_STATE_T, trigger_event);
-        remote_event_create(&local->trigger);
+        remote_event_create(&state->trigger_event, &local->trigger);
         local->tx_pos = 0;
-
-        local->recycle.event = offsetof(VCHIQ_STATE_T, recycle_event);
-        remote_event_create(&local->recycle);
+        remote_event_create(&state->recycle_event, &local->recycle);
         local->slot_queue_recycle = state->slot_queue_available;
-
-        local->sync_trigger.event = offsetof(VCHIQ_STATE_T, sync_trigger_event);
-        remote_event_create(&local->sync_trigger);
-
-        local->sync_release.event = offsetof(VCHIQ_STATE_T, sync_release_event);
-        remote_event_create(&local->sync_release);
+        remote_event_create(&state->sync_trigger_event, &local->sync_trigger);
+        remote_event_create(&state->sync_release_event, &local->sync_release);
 
         /* At start-of-day, the slot is empty and available */
         ((VCHIQ_HEADER_T *)SLOT_DATA_FROM_INDEX(state, local->slot_sync))->msgid
                 = VCHIQ_MSGID_PADDING;
-        remote_event_signal_local(state, &local->sync_release);
+        remote_event_signal_local(&state->sync_release_event, &local->sync_release);
 
         local->debug[DEBUG_ENTRIES] = DEBUG_MAX;
 
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h
index b76281f7510e..aae2c59700bd 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h
@@ -37,6 +37,7 @@
 #include <linux/mutex.h>
 #include <linux/completion.h>
 #include <linux/kthread.h>
+#include <linux/wait.h>
 
 #include "vchiq_cfg.h"
 
@@ -262,8 +263,7 @@ typedef struct vchiq_bulk_queue_struct {
 typedef struct remote_event_struct {
         int armed;
         int fired;
-        /* Contains offset from the beginning of the VCHIQ_STATE_T structure */
-        u32 event;
+        u32 __unused;
 } REMOTE_EVENT_T;
 
 typedef struct opaque_platform_state_t *VCHIQ_PLATFORM_STATE_T;
@@ -426,16 +426,16 @@ struct vchiq_state_struct {
         struct task_struct *sync_thread;
 
         /* Local implementation of the trigger remote event */
-        struct completion trigger_event;
+        wait_queue_head_t trigger_event;
 
         /* Local implementation of the recycle remote event */
-        struct completion recycle_event;
+        wait_queue_head_t recycle_event;
 
         /* Local implementation of the sync trigger remote event */
-        struct completion sync_trigger_event;
+        wait_queue_head_t sync_trigger_event;
 
         /* Local implementation of the sync release remote event */
-        struct completion sync_release_event;
+        wait_queue_head_t sync_release_event;
 
         char *tx_data;
         char *rx_data;
I had started the removal of semaphores in this driver without knowing
that Nicolas Saenz Julienne also worked on this. In case of the "remote
event" infrastructure, my solution seemed significantly better, so I'm
proposing this as a change on top.

The problem with using either semaphores or completions here is that
it's an overly complex way of waking up a thread, and it looks like the
'count' of the semaphore can easily get out of sync, even though I found
it hard to come up with a specific example.

Changing it to a 'wait_queue_head_t' instead of a completion simplifies
this by letting us wait directly on the 'event->fired' variable that is
set by the videocore.

Another simplification is passing the wait queue directly into the helper
functions instead of going through the fragile logic of recording the
offset inside of a structure as part of a shared memory variable. This
also avoids one uncached memory read and should be faster.

Note that I'm changing it back to 'killable' after the previous patch
changed 'killable' to 'interruptible', apparently based on a misunderstanding
of the subtle down_interruptible() macro override in vchiq_killable.h.

Fixes: f27e47bc6b8b ("staging: vchiq: use completions instead of semaphores")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 .../interface/vchiq_arm/vchiq_core.c | 63 +++++++------------
 .../interface/vchiq_arm/vchiq_core.h | 12 ++--
 2 files changed, 30 insertions(+), 45 deletions(-)

-- 
2.20.0
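As a condensed before/after of the "offset in shared memory" point made in the changelog (names simplified, not the actual driver code): the old scheme rebuilt a completion pointer from an offset stored in the shared event structure, while the new scheme simply takes the local wait queue as an argument, so nothing position-dependent lives in shared memory.

```c
#include <linux/completion.h>
#include <linux/types.h>
#include <linux/wait.h>

struct shared_event {
	int armed;
	int fired;
	u32 event_offset;	/* old scheme: offset into the local state struct */
};

/* Old scheme (fragile): pointer rebuilt from an offset kept in shared memory. */
static void old_signal_local(void *state, struct shared_event *ev)
{
	struct completion *c =
		(struct completion *)((char *)state + ev->event_offset);

	ev->armed = 0;
	complete(c);
}

/* New scheme: the caller passes the local wait queue directly. */
static void new_signal_local(wait_queue_head_t *wq, struct shared_event *ev)
{
	ev->armed = 0;
	wake_up_all(wq);
}
```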