Message ID | 1596781478-12216-2-git-send-email-mansur@codeaurora.org |
---|---|
State | Superseded |
Series | [RESEND,1/3] venus: core: handle race condititon for core ops |
Hi Mansur,

On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
> For core ops we are having only write protect but there
> is no read protect, because of this in multithreading
> and concurrency, one CPU core is reading without wait
> which is causing the NULL pointer dereference crash.
>
> one such scenario is as shown below, where in one
> core core->ops becoming NULL and in another core
> calling core->ops->session_init().
>
> CPU: core-7:
> Call trace:
> hfi_session_init+0x180/0x1dc [venus_core]

I thought more on this issue. I think we have to return an error from
hfi_session_init() in the case when the driver is in
system-error-handler. In fact, all userspace ioctls must end up with an
error while we are in recovery state. What do you think?

> vdec_queue_setup+0x9c/0x364 [venus_dec]
> vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
> vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
> v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
> v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
> v4l_reqbufs+0x4c/0x5c
> __video_do_ioctl+0x2b0/0x39c
>
> CPU: core-0:
> Call trace:
> venus_shutdown+0x98/0xfc [venus_core]
> venus_sys_error_handler+0x64/0x148 [venus_core]
> process_one_work+0x210/0x3d0
> worker_thread+0x248/0x3f4
> kthread+0x11c/0x12c
>
> Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
> ---
> drivers/media/platform/qcom/venus/core.c | 2 +-
> drivers/media/platform/qcom/venus/hfi.c | 5 ++++-
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
> index 203c653..fe99c83 100644
> --- a/drivers/media/platform/qcom/venus/core.c
> +++ b/drivers/media/platform/qcom/venus/core.c
> @@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
> 	pm_runtime_get_sync(core->dev);
>
> 	hfi_core_deinit(core, true);
> -	hfi_destroy(core);
> 	mutex_lock(&core->lock);
> +	hfi_destroy(core);
> 	venus_shutdown(core);
>
> 	pm_runtime_put_sync(core->dev);
> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
> index a211eb9..2eeb31f 100644
> --- a/drivers/media/platform/qcom/venus/hfi.c
> +++ b/drivers/media/platform/qcom/venus/hfi.c
> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
> int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
> {
> 	struct venus_core *core = inst->core;
> -	const struct hfi_ops *ops = core->ops;
> +	const struct hfi_ops *ops;
> 	int ret;
>
> 	if (inst->state != INST_UNINIT)
> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
> 	inst->hfi_codec = to_codec_type(pixfmt);
> 	reinit_completion(&inst->done);
>
> +	mutex_lock(&core->lock);
> +	ops = core->ops;
> 	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
> 	if (ret)
> 		return ret;
>
> +	mutex_unlock(&core->lock);
> 	ret = wait_session_msg(inst);
> 	if (ret)
> 		return ret;
>
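The suggestion in the reply, failing hfi_session_init() while the system error handler is running, could look roughly like the user-space model below. This is only a sketch under stated assumptions: `model_core`, `model_ops`, the `sys_error` flag and the `-EIO` return value are hypothetical names and choices made for illustration, not the venus driver's definitions, and pthread mutexes stand in for the kernel's `struct mutex`.

```c
/*
 * User-space model only. Every name here is hypothetical and is not
 * taken from the venus driver.
 */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_ops {
	int (*session_init)(int session_type);
};

struct model_core {
	pthread_mutex_t lock;
	const struct model_ops *ops;	/* NULL while recovery is in progress */
	bool sys_error;			/* hypothetical "in recovery" flag */
};

/* Stand-in backend: rejects negative session types, accepts the rest. */
static int model_fake_session_init(int session_type)
{
	return session_type < 0 ? -EINVAL : 0;
}

static const struct model_ops model_fake_ops = {
	.session_init = model_fake_session_init,
};

static void model_core_setup(struct model_core *core)
{
	int err = pthread_mutex_init(&core->lock, NULL);

	assert(err == 0);
	(void)err;
	core->ops = &model_fake_ops;
	core->sys_error = false;
}

/* Models the error handler: flag recovery and drop ops under the lock. */
static void model_enter_recovery(struct model_core *core)
{
	pthread_mutex_lock(&core->lock);
	core->sys_error = true;
	core->ops = NULL;
	pthread_mutex_unlock(&core->lock);
}

/* Models the end of recovery: restore ops, then clear the flag. */
static void model_leave_recovery(struct model_core *core)
{
	pthread_mutex_lock(&core->lock);
	core->ops = &model_fake_ops;
	core->sys_error = false;
	pthread_mutex_unlock(&core->lock);
}

/* Callers get an error instead of dereferencing a NULL ops pointer. */
static int model_session_init(struct model_core *core, int session_type)
{
	int ret;

	pthread_mutex_lock(&core->lock);
	if (core->sys_error || !core->ops) {
		ret = -EIO;	/* error code chosen for illustration */
		goto unlock;
	}
	ret = core->ops->session_init(session_type);
unlock:
	pthread_mutex_unlock(&core->lock);
	return ret;
}
```

In this model the flag and the ops pointer are only read and written with the lock held, so a caller either sees a usable ops table or gets an error back. Which error code user space should actually receive during recovery is left open in the thread.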
diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
index 203c653..fe99c83 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
 	pm_runtime_get_sync(core->dev);

 	hfi_core_deinit(core, true);
-	hfi_destroy(core);
 	mutex_lock(&core->lock);
+	hfi_destroy(core);
 	venus_shutdown(core);

 	pm_runtime_put_sync(core->dev);
diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
index a211eb9..2eeb31f 100644
--- a/drivers/media/platform/qcom/venus/hfi.c
+++ b/drivers/media/platform/qcom/venus/hfi.c
@@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
 int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
 {
 	struct venus_core *core = inst->core;
-	const struct hfi_ops *ops = core->ops;
+	const struct hfi_ops *ops;
 	int ret;

 	if (inst->state != INST_UNINIT)
@@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
 	inst->hfi_codec = to_codec_type(pixfmt);
 	reinit_completion(&inst->done);

+	mutex_lock(&core->lock);
+	ops = core->ops;
 	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
 	if (ret)
 		return ret;

+	mutex_unlock(&core->lock);
 	ret = wait_session_msg(inst);
 	if (ret)
 		return ret;
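One detail of the hfi.c hunk is worth spelling out: when ops->session_init() fails, the function returns before mutex_unlock(), so the error path leaves core->lock held. A minimal user-space sketch of a balanced version follows. It assumes a single unlock label and a NULL check on the ops pointer; `sketch_core`, `sketch_ops` and the `-ENODEV` value are hypothetical and chosen for illustration, and pthread mutexes again stand in for the kernel mutex.

```c
/*
 * User-space model only. Names are hypothetical and do not come from
 * the venus driver.
 */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_ops {
	int (*session_init)(unsigned int codec);
};

struct sketch_core {
	pthread_mutex_t lock;
	const struct sketch_ops *ops;
};

/* Stand-in backend: codec 0 is treated as invalid. */
static int sketch_backend_init(unsigned int codec)
{
	return codec == 0 ? -EINVAL : 0;
}

static const struct sketch_ops sketch_default_ops = {
	.session_init = sketch_backend_init,
};

static void sketch_core_setup(struct sketch_core *core)
{
	int err = pthread_mutex_init(&core->lock, NULL);

	assert(err == 0);
	(void)err;
	core->ops = &sketch_default_ops;
}

static int sketch_session_init(struct sketch_core *core, unsigned int codec)
{
	const struct sketch_ops *ops;
	int ret;

	pthread_mutex_lock(&core->lock);
	ops = core->ops;
	if (!ops) {
		ret = -ENODEV;	/* error code chosen for illustration */
		goto unlock;
	}
	ret = ops->session_init(codec);
unlock:
	/* Single exit point: the lock is dropped on success and on failure. */
	pthread_mutex_unlock(&core->lock);
	if (ret)
		return ret;

	/* Waiting for the firmware reply would happen here, lock not held. */
	return 0;
}

/* Helper for checking the model: true if nobody holds the lock. */
static bool sketch_lock_is_free(struct sketch_core *core)
{
	if (pthread_mutex_trylock(&core->lock) != 0)
		return false;
	pthread_mutex_unlock(&core->lock);
	return true;
}
```

Calling `sketch_session_init(&core, 0)` returns `-EINVAL` and `sketch_lock_is_free(&core)` still reports true afterwards, which is the property the early return in the hunk would break.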
For core ops we are having only write protect but there
is no read protect, because of this in multithreading
and concurrency, one CPU core is reading without wait
which is causing the NULL pointer dereference crash.

one such scenario is as shown below, where in one
core core->ops becoming NULL and in another core
calling core->ops->session_init().

CPU: core-7:
Call trace:
 hfi_session_init+0x180/0x1dc [venus_core]
 vdec_queue_setup+0x9c/0x364 [venus_dec]
 vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
 vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
 v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
 v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
 v4l_reqbufs+0x4c/0x5c
 __video_do_ioctl+0x2b0/0x39c

CPU: core-0:
Call trace:
 venus_shutdown+0x98/0xfc [venus_core]
 venus_sys_error_handler+0x64/0x148 [venus_core]
 process_one_work+0x210/0x3d0
 worker_thread+0x248/0x3f4
 kthread+0x11c/0x12c

Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
---
 drivers/media/platform/qcom/venus/core.c | 2 +-
 drivers/media/platform/qcom/venus/hfi.c  | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)