
[2/3] locking/barriers: Kill lockless_dereference

Message ID 1507818377-7546-3-git-send-email-will.deacon@arm.com
State New
Series Remove lockless_dereference

Commit Message

Will Deacon Oct. 12, 2017, 2:26 p.m. UTC
lockless_dereference is a nice idea, but it's gained little traction in
kernel code since its introduction three years ago. This is partly
because it's a pain to type, but also because using READ_ONCE instead
will work correctly on all architectures apart from Alpha, which is a
fully supported but somewhat niche architecture these days.

This patch moves smp_read_barrier_depends() (a NOP on all architectures
other than Alpha) from lockless_dereference into READ_ONCE, converts
the few actual users over to READ_ONCE and then finally removes
lockless_dereference altogether.

Signed-off-by: Will Deacon <will.deacon@arm.com>

---
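For illustration only (not part of the diff below), here is a minimal sketch of
the publication pattern this change caters for; struct foo, publish_foo() and
read_foo() are made-up names for the example and appear nowhere in the tree.
The writer publishes a pointer with smp_store_release(), and after this patch
the reader gets dependency ordering from a plain READ_ONCE(), because the
smp_read_barrier_depends() now lives inside READ_ONCE():

struct foo {
	int a;
};

static struct foo *global_foo;	/* published by the writer, read locklessly */

/* Writer: initialise the object before publishing the pointer. */
static void publish_foo(struct foo *p)
{
	p->a = 42;
	smp_store_release(&global_foo, p);	/* orders the init before the store */
}

/* Reader: formerly lockless_dereference(global_foo); READ_ONCE() alone
 * now suffices, even on Alpha. */
static int read_foo(void)
{
	struct foo *p = READ_ONCE(global_foo);

	return p ? p->a : -1;
}

Since smp_read_barrier_depends() is a NOP everywhere except Alpha, the reader
side costs nothing extra on other architectures.
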
 Documentation/memory-barriers.txt                   | 12 ------------
 .../translations/ko_KR/memory-barriers.txt          | 12 ------------
 arch/x86/events/core.c                              |  2 +-
 arch/x86/include/asm/mmu_context.h                  |  4 ++--
 arch/x86/kernel/ldt.c                               |  2 +-
 drivers/md/dm-mpath.c                               | 20 ++++++++++----------
 fs/dcache.c                                         |  4 ++--
 fs/overlayfs/ovl_entry.h                            |  2 +-
 fs/overlayfs/readdir.c                              |  2 +-
 include/linux/compiler.h                            | 21 +--------------------
 include/linux/rculist.h                             |  4 ++--
 include/linux/rcupdate.h                            |  4 ++--
 kernel/events/core.c                                |  4 ++--
 kernel/seccomp.c                                    |  2 +-
 kernel/task_work.c                                  |  2 +-
 mm/slab.h                                           |  2 +-
 16 files changed, 28 insertions(+), 71 deletions(-)

-- 
2.1.4

Comments

Paul E. McKenney Oct. 12, 2017, 3:16 p.m. UTC | #1
On Thu, Oct 12, 2017 at 03:26:16PM +0100, Will Deacon wrote:
> lockless_dereference is a nice idea, but it's gained little traction in
> kernel code since its introduction three years ago. This is partly
> because it's a pain to type, but also because using READ_ONCE instead
> will work correctly on all architectures apart from Alpha, which is a
> fully supported but somewhat niche architecture these days.
> 
> This patch moves smp_read_barrier_depends() (a NOP on all architectures
> other than Alpha) from lockless_dereference into READ_ONCE, converts
> the few actual users over to READ_ONCE and then finally removes
> lockless_dereference altogether.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>


Patch

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index b759a60624fd..470a682f3fa4 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1886,18 +1886,6 @@  There are some more advanced barrier functions:
      See Documentation/atomic_{t,bitops}.txt for more information.
 
 
- (*) lockless_dereference();
-
-     This can be thought of as a pointer-fetch wrapper around the
-     smp_read_barrier_depends() data-dependency barrier.
-
-     This is also similar to rcu_dereference(), but in cases where
-     object lifetime is handled by some mechanism other than RCU, for
-     example, when the objects removed only when the system goes down.
-     In addition, lockless_dereference() is used in some data structures
-     that can be used both with and without RCU.
-
-
  (*) dma_wmb();
  (*) dma_rmb();
 
diff --git a/Documentation/translations/ko_KR/memory-barriers.txt b/Documentation/translations/ko_KR/memory-barriers.txt
index a7a813258013..ec3b46e27b7a 100644
--- a/Documentation/translations/ko_KR/memory-barriers.txt
+++ b/Documentation/translations/ko_KR/memory-barriers.txt
@@ -1858,18 +1858,6 @@  Mandatory 배리어들은 SMP 시스템에서도 UP 시스템에서도 SMP 효
      참고하세요.
 
 
- (*) lockless_dereference();
-
-     이 함수는 smp_read_barrier_depends() 데이터 의존성 배리어를 사용하는
-     포인터 읽어오기 래퍼(wrapper) 함수로 생각될 수 있습니다.
-
-     객체의 라이프타임이 RCU 외의 메커니즘으로 관리된다는 점을 제외하면
-     rcu_dereference() 와도 유사한데, 예를 들면 객체가 시스템이 꺼질 때에만
-     제거되는 경우 등입니다.  또한, lockless_dereference() 은 RCU 와 함께
-     사용될수도, RCU 없이 사용될 수도 있는 일부 데이터 구조에 사용되고
-     있습니다.
-
-
  (*) dma_wmb();
  (*) dma_rmb();
 
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 80534d3c2480..589af1eec7c1 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2371,7 +2371,7 @@  static unsigned long get_segment_base(unsigned int segment)
 		struct ldt_struct *ldt;
 
 		/* IRQs are off, so this synchronizes with smp_store_release */
-		ldt = lockless_dereference(current->active_mm->context.ldt);
+		ldt = READ_ONCE(current->active_mm->context.ldt);
 		if (!ldt || idx >= ldt->nr_entries)
 			return 0;
 
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index c120b5db178a..9037a4e546e8 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -72,8 +72,8 @@  static inline void load_mm_ldt(struct mm_struct *mm)
 #ifdef CONFIG_MODIFY_LDT_SYSCALL
 	struct ldt_struct *ldt;
 
-	/* lockless_dereference synchronizes with smp_store_release */
-	ldt = lockless_dereference(mm->context.ldt);
+	/* READ_ONCE synchronizes with smp_store_release */
+	ldt = READ_ONCE(mm->context.ldt);
 
 	/*
 	 * Any change to mm->context.ldt is followed by an IPI to all
diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index f0e64db18ac8..0a21390642c4 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -101,7 +101,7 @@  static void finalize_ldt_struct(struct ldt_struct *ldt)
 static void install_ldt(struct mm_struct *current_mm,
 			struct ldt_struct *ldt)
 {
-	/* Synchronizes with lockless_dereference in load_mm_ldt. */
+	/* Synchronizes with READ_ONCE in load_mm_ldt. */
 	smp_store_release(&current_mm->context.ldt, ldt);
 
 	/* Activate the LDT for all CPUs using current_mm. */
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 11f273d2f018..3f88c9d32f7e 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -366,7 +366,7 @@  static struct pgpath *choose_path_in_pg(struct multipath *m,
 
 	pgpath = path_to_pgpath(path);
 
-	if (unlikely(lockless_dereference(m->current_pg) != pg)) {
+	if (unlikely(READ_ONCE(m->current_pg) != pg)) {
 		/* Only update current_pgpath if pg changed */
 		spin_lock_irqsave(&m->lock, flags);
 		m->current_pgpath = pgpath;
@@ -390,7 +390,7 @@  static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes)
 	}
 
 	/* Were we instructed to switch PG? */
-	if (lockless_dereference(m->next_pg)) {
+	if (READ_ONCE(m->next_pg)) {
 		spin_lock_irqsave(&m->lock, flags);
 		pg = m->next_pg;
 		if (!pg) {
@@ -406,7 +406,7 @@  static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes)
 
 	/* Don't change PG until it has no remaining paths */
 check_current_pg:
-	pg = lockless_dereference(m->current_pg);
+	pg = READ_ONCE(m->current_pg);
 	if (pg) {
 		pgpath = choose_path_in_pg(m, pg, nr_bytes);
 		if (!IS_ERR_OR_NULL(pgpath))
@@ -473,7 +473,7 @@  static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 	struct request *clone;
 
 	/* Do we need to select a new pgpath? */
-	pgpath = lockless_dereference(m->current_pgpath);
+	pgpath = READ_ONCE(m->current_pgpath);
 	if (!pgpath || !test_bit(MPATHF_QUEUE_IO, &m->flags))
 		pgpath = choose_pgpath(m, nr_bytes);
 
@@ -535,7 +535,7 @@  static int __multipath_map_bio(struct multipath *m, struct bio *bio, struct dm_m
 	bool queue_io;
 
 	/* Do we need to select a new pgpath? */
-	pgpath = lockless_dereference(m->current_pgpath);
+	pgpath = READ_ONCE(m->current_pgpath);
 	queue_io = test_bit(MPATHF_QUEUE_IO, &m->flags);
 	if (!pgpath || !queue_io)
 		pgpath = choose_pgpath(m, nr_bytes);
@@ -1804,7 +1804,7 @@  static int multipath_prepare_ioctl(struct dm_target *ti,
 	struct pgpath *current_pgpath;
 	int r;
 
-	current_pgpath = lockless_dereference(m->current_pgpath);
+	current_pgpath = READ_ONCE(m->current_pgpath);
 	if (!current_pgpath)
 		current_pgpath = choose_pgpath(m, 0);
 
@@ -1826,7 +1826,7 @@  static int multipath_prepare_ioctl(struct dm_target *ti,
 	}
 
 	if (r == -ENOTCONN) {
-		if (!lockless_dereference(m->current_pg)) {
+		if (!READ_ONCE(m->current_pg)) {
 			/* Path status changed, redo selection */
 			(void) choose_pgpath(m, 0);
 		}
@@ -1895,9 +1895,9 @@  static int multipath_busy(struct dm_target *ti)
 		return (m->queue_mode != DM_TYPE_MQ_REQUEST_BASED);
 
 	/* Guess which priority_group will be used at next mapping time */
-	pg = lockless_dereference(m->current_pg);
-	next_pg = lockless_dereference(m->next_pg);
-	if (unlikely(!lockless_dereference(m->current_pgpath) && next_pg))
+	pg = READ_ONCE(m->current_pg);
+	next_pg = READ_ONCE(m->next_pg);
+	if (unlikely(!READ_ONCE(m->current_pgpath) && next_pg))
 		pg = next_pg;
 
 	if (!pg) {
diff --git a/fs/dcache.c b/fs/dcache.c
index f90141387f01..34c852af215c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -231,7 +231,7 @@  static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
 {
 	/*
 	 * Be careful about RCU walk racing with rename:
-	 * use 'lockless_dereference' to fetch the name pointer.
+	 * use 'READ_ONCE' to fetch the name pointer.
 	 *
 	 * NOTE! Even if a rename will mean that the length
 	 * was not loaded atomically, we don't care. The
@@ -245,7 +245,7 @@  static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
 	 * early because the data cannot match (there can
 	 * be no NUL in the ct/tcount data)
 	 */
-	const unsigned char *cs = lockless_dereference(dentry->d_name.name);
+	const unsigned char *cs = READ_ONCE(dentry->d_name.name);
 
 	return dentry_string_cmp(cs, ct, tcount);
 }
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 25d9b5adcd42..36b49bd09264 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -77,5 +77,5 @@  static inline struct ovl_inode *OVL_I(struct inode *inode)
 
 static inline struct dentry *ovl_upperdentry_dereference(struct ovl_inode *oi)
 {
-	return lockless_dereference(oi->__upperdentry);
+	return READ_ONCE(oi->__upperdentry);
 }
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 0f85ee9c3268..c67a7703296b 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -754,7 +754,7 @@  static int ovl_dir_fsync(struct file *file, loff_t start, loff_t end,
 	if (!od->is_upper && OVL_TYPE_UPPER(ovl_path_type(dentry))) {
 		struct inode *inode = file_inode(file);
 
-		realfile = lockless_dereference(od->upperfile);
+		realfile = READ_ONCE(od->upperfile);
 		if (!realfile) {
 			struct path upperpath;
 
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index c714239f453f..4517662f128c 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -242,6 +242,7 @@  static __always_inline void __write_once_size(volatile void *p, void *res, int s
 		__read_once_size(&(x), __u.__c, sizeof(x));		\
 	else								\
 		__read_once_size_nocheck(&(x), __u.__c, sizeof(x));	\
+	smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
 	__u.__val;							\
 })
 #define READ_ONCE(x) __READ_ONCE(x, 1)
@@ -345,24 +346,4 @@  static __always_inline void __write_once_size(volatile void *p, void *res, int s
 	(volatile typeof(x) *)&(x); })
 #define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
 
-/**
- * lockless_dereference() - safely load a pointer for later dereference
- * @p: The pointer to load
- *
- * Similar to rcu_dereference(), but for situations where the pointed-to
- * object's lifetime is managed by something other than RCU.  That
- * "something other" might be reference counting or simple immortality.
- *
- * The seemingly unused variable ___typecheck_p validates that @p is
- * indeed a pointer type by using a pointer to typeof(*p) as the type.
- * Taking a pointer to typeof(*p) again is needed in case p is void *.
- */
-#define lockless_dereference(p) \
-({ \
-	typeof(p) _________p1 = READ_ONCE(p); \
-	typeof(*(p)) *___typecheck_p __maybe_unused; \
-	smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
-	(_________p1); \
-})
-
 #endif /* __LINUX_COMPILER_H */
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index b1fd8bf85fdc..3a2bb7d8ed4d 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -274,7 +274,7 @@  static inline void list_splice_tail_init_rcu(struct list_head *list,
  * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
  */
 #define list_entry_rcu(ptr, type, member) \
-	container_of(lockless_dereference(ptr), type, member)
+	container_of(READ_ONCE(ptr), type, member)
 
 /**
  * Where are list_empty_rcu() and list_first_entry_rcu()?
@@ -367,7 +367,7 @@  static inline void list_splice_tail_init_rcu(struct list_head *list,
  * example is when items are added to the list, but never deleted.
  */
 #define list_entry_lockless(ptr, type, member) \
-	container_of((typeof(ptr))lockless_dereference(ptr), type, member)
+	container_of((typeof(ptr))READ_ONCE(ptr), type, member)
 
 /**
  * list_for_each_entry_lockless - iterate over rcu list of given type
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index de50d8a4cf41..380a3aeb09d7 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -346,7 +346,7 @@  static inline void rcu_preempt_sleep_check(void) { }
 #define __rcu_dereference_check(p, c, space) \
 ({ \
 	/* Dependency order vs. p above. */ \
-	typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
+	typeof(*p) *________p1 = (typeof(*p) *__force)READ_ONCE(p); \
 	RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_check() usage"); \
 	rcu_dereference_sparse(p, space); \
 	((typeof(*p) __force __kernel *)(________p1)); \
@@ -360,7 +360,7 @@  static inline void rcu_preempt_sleep_check(void) { }
 #define rcu_dereference_raw(p) \
 ({ \
 	/* Dependency order vs. p above. */ \
-	typeof(p) ________p1 = lockless_dereference(p); \
+	typeof(p) ________p1 = READ_ONCE(p); \
 	((typeof(*p) __force __kernel *)(________p1)); \
 })
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6bc21e202ae4..417812ce0099 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4231,7 +4231,7 @@  static void perf_remove_from_owner(struct perf_event *event)
 	 * indeed free this event, otherwise we need to serialize on
 	 * owner->perf_event_mutex.
 	 */
-	owner = lockless_dereference(event->owner);
+	owner = READ_ONCE(event->owner);
 	if (owner) {
 		/*
 		 * Since delayed_put_task_struct() also drops the last
@@ -4328,7 +4328,7 @@  int perf_event_release_kernel(struct perf_event *event)
 		 * Cannot change, child events are not migrated, see the
 		 * comment with perf_event_ctx_lock_nested().
 		 */
-		ctx = lockless_dereference(child->ctx);
+		ctx = READ_ONCE(child->ctx);
 		/*
 		 * Since child_mutex nests inside ctx::mutex, we must jump
 		 * through hoops. We start by grabbing a reference on the ctx.
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index bb3a38005b9c..1daa8b61a268 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -189,7 +189,7 @@  static u32 seccomp_run_filters(const struct seccomp_data *sd,
 	u32 ret = SECCOMP_RET_ALLOW;
 	/* Make sure cross-thread synced filter points somewhere sane. */
 	struct seccomp_filter *f =
-			lockless_dereference(current->seccomp.filter);
+			READ_ONCE(current->seccomp.filter);
 
 	/* Ensure unexpected behavior doesn't result in failing open. */
 	if (unlikely(WARN_ON(f == NULL)))
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 836a72a66fba..9a9f262fc53d 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -67,7 +67,7 @@  task_work_cancel(struct task_struct *task, task_work_func_t func)
 	 * we raced with task_work_run(), *pprev == NULL/exited.
 	 */
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
-	while ((work = lockless_dereference(*pprev))) {
+	while ((work = READ_ONCE(*pprev))) {
 		if (work->func != func)
 			pprev = &work->next;
 		else if (cmpxchg(pprev, work, work->next) == work)
diff --git a/mm/slab.h b/mm/slab.h
index 073362816acc..8894f811a89d 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -258,7 +258,7 @@  cache_from_memcg_idx(struct kmem_cache *s, int idx)
 	 * memcg_caches issues a write barrier to match this (see
 	 * memcg_create_kmem_cache()).
 	 */
-	cachep = lockless_dereference(arr->entries[idx]);
+	cachep = READ_ONCE(arr->entries[idx]);
 	rcu_read_unlock();
 
 	return cachep;