Message ID | 20210121205119.793296-1-mic@digikod.net |
---|---|
Series | Landlock LSM |
Hello Mickaël,

It would be great to have some manual pages for these system calls
before release... Can you prepare something?

Thanks,

Michael

On Thu, 21 Jan 2021 at 21:51, Mickaël Salaün <mic@digikod.net> wrote:
>
> From: Mickaël Salaün <mic@linux.microsoft.com>
>
> This documentation can be built with the Sphinx framework.
>
> Cc: James Morris <jmorris@namei.org>
> Cc: Jann Horn <jannh@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
> Reviewed-by: Vincent Dagonneau <vincent.dagonneau@ssi.gouv.fr>
> ---
>
> Changes since v25:
> * Explain the behavior of layered access rights.
> * Explain how bind mounts and overayfs mounts are handled by Landlock:
>   merged overlayfs mount points have their own inodes, which makes these
>   hierarchies independent from its upper and lower layers, unlike bind
>   mounts which share the same inodes between the source hierarchy and
>   the mount point hierarchy.
>   New overlayfs mount and bind mount tests check these behaviors.
> * Synchronize with the new syscalls.c file and update syscall names.
> * Fix spelling.
> * Remove Reviewed-by Jann Horn because of the above changes.
>
> Changes since v24:
> * Add Reviewed-by Jann Horn.
> * Add a paragraph to explain how the ruleset layers work.
> * Bump date.
>
> Changes since v23:
> * Explain limitations for the maximum number of stacked ruleset, and the
>   memory usage restrictions.
>
> Changes since v22:
> * Fix spelling and remove obsolete sentence (spotted by Jann Horn).
> * Bump date.
>
> Changes since v21:
> * Move the user space documentation to userspace-api/landlock.rst and
>   the kernel documentation to security/landlock.rst .
> * Add license headers.
> * Add last update dates.
> * Update MAINTAINERS file.
> * Add (back) links to git.kernel.org .
> * Fix spelling.
>
> Changes since v20:
> * Update examples and documentation with the new syscalls.
>
> Changes since v19:
> * Update examples and documentation with the new syscalls.
>
> Changes since v15:
> * Add current limitations.
>
> Changes since v14:
> * Fix spelling (contributed by Randy Dunlap).
> * Extend documentation about inheritance and explain layer levels.
> * Remove the use of now-removed access rights.
> * Use GitHub links.
> * Improve kernel documentation.
> * Add section for tests.
> * Update example.
>
> Changes since v13:
> * Rewrote the documentation according to the major revamp.
>
> Previous changes:
> https://lore.kernel.org/lkml/20191104172146.30797-8-mic@digikod.net/
> ---
>  Documentation/security/index.rst         |   1 +
>  Documentation/security/landlock.rst      |  79 ++++++
>  Documentation/userspace-api/index.rst    |   1 +
>  Documentation/userspace-api/landlock.rst | 306 +++++++++++++++++++++++
>  MAINTAINERS                              |   2 +
>  5 files changed, 389 insertions(+)
>  create mode 100644 Documentation/security/landlock.rst
>  create mode 100644 Documentation/userspace-api/landlock.rst
>
> diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
> index 8129405eb2cc..16335de04e8c 100644
> --- a/Documentation/security/index.rst
> +++ b/Documentation/security/index.rst
> @@ -16,3 +16,4 @@ Security Documentation
>     siphash
>     tpm/index
>     digsig
> +   landlock
> diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> new file mode 100644
> index 000000000000..244e616d3d7a
> --- /dev/null
> +++ b/Documentation/security/landlock.rst
> @@ -0,0 +1,79 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
> +.. Copyright © 2019-2020 ANSSI
> +
> +==================================
> +Landlock LSM: kernel documentation
> +==================================
> +
> +:Author: Mickaël Salaün
> +:Date: January 2021
> +
> +Landlock's goal is to create scoped access-control (i.e. sandboxing). To
> +harden a whole system, this feature should be available to any process,
> +including unprivileged ones. Because such process may be compromised or
> +backdoored (i.e. untrusted), Landlock's features must be safe to use from the
> +kernel and other processes point of view. Landlock's interface must therefore
> +expose a minimal attack surface.
> +
> +Landlock is designed to be usable by unprivileged processes while following the
> +system security policy enforced by other access control mechanisms (e.g. DAC,
> +LSM). Indeed, a Landlock rule shall not interfere with other access-controls
> +enforced on the system, only add more restrictions.
> +
> +Any user can enforce Landlock rulesets on their processes. They are merged and
> +evaluated according to the inherited ones in a way that ensures that only more
> +constraints can be added.
> +
> +User space documentation can be found here: :doc:`/userspace-api/landlock`.
> +
> +Guiding principles for safe access controls
> +===========================================
> +
> +* A Landlock rule shall be focused on access control on kernel objects instead
> +  of syscall filtering (i.e. syscall arguments), which is the purpose of
> +  seccomp-bpf.
> +* To avoid multiple kinds of side-channel attacks (e.g. leak of security
> +  policies, CPU-based attacks), Landlock rules shall not be able to
> +  programmatically communicate with user space.
> +* Kernel access check shall not slow down access request from unsandboxed
> +  processes.
> +* Computation related to Landlock operations (e.g. enforcing a ruleset) shall
> +  only impact the processes requesting them.
> +
> +Tests
> +=====
> +
> +Userspace tests for backward compatibility, ptrace restrictions and filesystem
> +support can be found here: `tools/testing/selftests/landlock/`_.
> +
> +Kernel structures
> +=================
> +
> +Object
> +------
> +
> +.. kernel-doc:: security/landlock/object.h
> +    :identifiers:
> +
> +Ruleset and domain
> +------------------
> +
> +A domain is a read-only ruleset tied to a set of subjects (i.e. tasks'
> +credentials). Each time a ruleset is enforced on a task, the current domain is
> +duplicated and the ruleset is imported as a new layer of rules in the new
> +domain. Indeed, once in a domain, each rule is tied to a layer level. To
> +grant access to an object, at least one rule of each layer must allow the
> +requested action on the object. A task can then only transit to a new domain
> +that is the intersection of the constraints from the current domain and those
> +of a ruleset provided by the task.
> +
> +The definition of a subject is implicit for a task sandboxing itself, which
> +makes the reasoning much easier and helps avoid pitfalls.
> +
> +.. kernel-doc:: security/landlock/ruleset.h
> +    :identifiers:
> +
> +.. Links
> +.. _tools/testing/selftests/landlock/:
> +   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/landlock/
> diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
> index acd2cc2a538d..01f1748ab569 100644
> --- a/Documentation/userspace-api/index.rst
> +++ b/Documentation/userspace-api/index.rst
> @@ -18,6 +18,7 @@ place where this information is gathered.
>
>     no_new_privs
>     seccomp_filter
> +   landlock
>     unshare
>     spec_ctrl
>     accelerators/ocxl
> diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> new file mode 100644
> index 000000000000..06c16f2b038b
> --- /dev/null
> +++ b/Documentation/userspace-api/landlock.rst
> @@ -0,0 +1,306 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
> +.. Copyright © 2019-2020 ANSSI
> +
> +=====================================
> +Landlock: unprivileged access control
> +=====================================
> +
> +:Author: Mickaël Salaün
> +:Date: January 2021
> +
> +The goal of Landlock is to enable to restrict ambient rights (e.g. global
> +filesystem access) for a set of processes. Because Landlock is a stackable
> +LSM, it makes possible to create safe security sandboxes as new security layers
> +in addition to the existing system-wide access-controls. This kind of sandbox
> +is expected to help mitigate the security impact of bugs or
> +unexpected/malicious behaviors in user space applications. Landlock empowers
> +any process, including unprivileged ones, to securely restrict themselves.
> +
> +Landlock rules
> +==============
> +
> +A Landlock rule describes an action on an object. An object is currently a
> +file hierarchy, and the related filesystem actions are defined in `Access
> +rights`_. A set of rules is aggregated in a ruleset, which can then restrict
> +the thread enforcing it, and its future children.
> +
> +Defining and enforcing a security policy
> +----------------------------------------
> +
> +We first need to create the ruleset that will contain our rules. For this
> +example, the ruleset will contain rules that only allow read actions, but write
> +actions will be denied. The ruleset then needs to handle both of these kind of
> +actions.
> +
> +.. code-block:: c
> +
> +    int ruleset_fd;
> +    struct landlock_ruleset_attr ruleset_attr = {
> +        .handled_access_fs =
> +            LANDLOCK_ACCESS_FS_EXECUTE |
> +            LANDLOCK_ACCESS_FS_WRITE_FILE |
> +            LANDLOCK_ACCESS_FS_READ_FILE |
> +            LANDLOCK_ACCESS_FS_READ_DIR |
> +            LANDLOCK_ACCESS_FS_REMOVE_DIR |
> +            LANDLOCK_ACCESS_FS_REMOVE_FILE |
> +            LANDLOCK_ACCESS_FS_MAKE_CHAR |
> +            LANDLOCK_ACCESS_FS_MAKE_DIR |
> +            LANDLOCK_ACCESS_FS_MAKE_REG |
> +            LANDLOCK_ACCESS_FS_MAKE_SOCK |
> +            LANDLOCK_ACCESS_FS_MAKE_FIFO |
> +            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
> +            LANDLOCK_ACCESS_FS_MAKE_SYM,
> +    };
> +
> +    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
> +    if (ruleset_fd < 0) {
> +        perror("Failed to create a ruleset");
> +        return 1;
> +    }
> +
> +We can now add a new rule to this ruleset thanks to the returned file
> +descriptor referring to this ruleset. The rule will only allow reading the
> +file hierarchy ``/usr``. Without another rule, write actions would then be
> +denied by the ruleset. To add ``/usr`` to the ruleset, we open it with the
> +``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
> +descriptor.
> +
> +.. code-block:: c
> +
> +    int err;
> +    struct landlock_path_beneath_attr path_beneath = {
> +        .allowed_access =
> +            LANDLOCK_ACCESS_FS_EXECUTE |
> +            LANDLOCK_ACCESS_FS_READ_FILE |
> +            LANDLOCK_ACCESS_FS_READ_DIR,
> +    };
> +
> +    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
> +    if (path_beneath.parent_fd < 0) {
> +        perror("Failed to open file");
> +        close(ruleset_fd);
> +        return 1;
> +    }
> +    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
> +                            &path_beneath, 0);
> +    close(path_beneath.parent_fd);
> +    if (err) {
> +        perror("Failed to update ruleset");
> +        close(ruleset_fd);
> +        return 1;
> +    }
> +
> +We now have a ruleset with one rule allowing read access to ``/usr`` while
> +denying all other handled accesses for the filesystem. The next step is to
> +restrict the current thread from gaining more privileges (e.g. thanks to a SUID
> +binary).
> +
> +.. code-block:: c
> +
> +    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
> +        perror("Failed to restrict privileges");
> +        close(ruleset_fd);
> +        return 1;
> +    }
> +
> +The current thread is now ready to sandbox itself with the ruleset.
> +
> +.. code-block:: c
> +
> +    if (landlock_enforce_ruleset_self(ruleset_fd, 0)) {
> +        perror("Failed to enforce ruleset");
> +        close(ruleset_fd);
> +        return 1;
> +    }
> +    close(ruleset_fd);
> +
> +If the `landlock_enforce_ruleset_self` system call succeeds, the current thread
> +is now restricted and this policy will be enforced on all its subsequently
> +created children as well. Once a thread is landlocked, there is no way to
> +remove its security policy; only adding more restrictions is allowed. These
> +threads are now in a new Landlock domain, merge of their parent one (if any)
> +with the new ruleset.
> +
> +Full working code can be found in `samples/landlock/sandboxer.c`_.
> +
> +Layers of file path access rights
> +---------------------------------
> +
> +Each time a thread enforces a ruleset on itself, it updates its Landlock domain
> +with a new layer of policy. Indeed, this complementary policy is stacked with
> +the potentially other rulesets already restricting this thread. A sandboxed
> +thread can then safely add more constraints to itself with a new enforced
> +ruleset.
> +
> +One policy layer grants access to a file path if at least one of its rules
> +encountered on the path grants the access. A sandboxed thread can only access
> +a file path if all its enforced policy layers grant the access as well as all
> +the other system access controls (e.g. filesystem DAC, other LSM policies,
> +etc.).
> +
> +Bind mounts and OverlayFS
> +-------------------------
> +
> +Landlock enables to restrict access to file hierarchies, which means that these
> +access rights can be propagated with bind mounts (cf.
> +:doc:`/filesystems/sharedsubtree`) but not with :doc:`/filesystems/overlayfs`.
> +
> +A bind mount mirrors a source file hierarchy to a destination. The destination
> +hierarchy is then composed of the exact same files, on which Landlock rules can
> +be tied, either via the source or the destination path. These rules restrict
> +access when they are encountered on a path, which means that they can restrict
> +access to multiple file hierarchies at the same time, whether these hierarchies
> +are the result of bind mounts or not.
> +
> +An OverlayFS mount point consists of upper and lower layers. These layers are
> +combined in a merge directory, result of the mount point. This merge hierarchy
> +may include files from the upper and lower layers, but modifications performed
> +on the merge hierarchy only reflects on the upper layer. From a Landlock
> +policy point of view, each OverlayFS layers and merge hierarchies are
> +standalone and contains their own set of files and directories, which is
> +different from bind mounts. A policy restricting an OverlayFS layer will not
> +restrict the resulted merged hierarchy, and vice versa.
> +
> +Inheritance
> +-----------
> +
> +Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
> +restrictions from its parent. This is similar to the seccomp inheritance (cf.
> +:doc:`/userspace-api/seccomp_filter`) or any other LSM dealing with task's
> +:manpage:`credentials(7)`. For instance, one process's thread may apply
> +Landlock rules to itself, but they will not be automatically applied to other
> +sibling threads (unlike POSIX thread credential changes, cf.
> +:manpage:`nptl(7)`).
> +
> +When a thread sandboxes itself, we have the guarantee that the related security
> +policy will stay enforced on all this thread's descendants. This allows
> +creating standalone and modular security policies per application, which will
> +automatically be composed between themselves according to their runtime parent
> +policies.
> +
> +Ptrace restrictions
> +-------------------
> +
> +A sandboxed process has less privileges than a non-sandboxed process and must
> +then be subject to additional restrictions when manipulating another process.
> +To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
> +process, a sandboxed process should have a subset of the target process rules,
> +which means the tracee must be in a sub-domain of the tracer.
> +
> +Kernel interface
> +================
> +
> +Access rights
> +-------------
> +
> +.. kernel-doc:: include/uapi/linux/landlock.h
> +    :identifiers: fs_access
> +
> +Creating a new ruleset
> +----------------------
> +
> +.. kernel-doc:: security/landlock/syscalls.c
> +    :identifiers: sys_landlock_create_ruleset
> +
> +.. kernel-doc:: include/uapi/linux/landlock.h
> +    :identifiers: landlock_ruleset_attr
> +
> +Extending a ruleset
> +-------------------
> +
> +.. kernel-doc:: security/landlock/syscalls.c
> +    :identifiers: sys_landlock_add_rule
> +
> +.. kernel-doc:: include/uapi/linux/landlock.h
> +    :identifiers: landlock_rule_type landlock_path_beneath_attr
> +
> +Enforcing a ruleset
> +-------------------
> +
> +.. kernel-doc:: security/landlock/syscalls.c
> +    :identifiers: sys_landlock_enforce_ruleset_self
> +
> +Current limitations
> +===================
> +
> +Ruleset layers
> +--------------
> +
> +There is a limit of 64 layers of stacked rulesets. This can be an issue for a
> +task willing to enforce a new ruleset in complement to its 64 inherited
> +rulesets. Once this limit is reached, sys_landlock_enforce_ruleset_self()
> +returns E2BIG. It is then strongly suggested to carefully build rulesets once
> +in the life of a thread, especially for applications able to launch other
> +applications that may also want to sandbox themselves (e.g. shells, container
> +managers, etc.).
> +
> +Memory usage
> +------------
> +
> +Kernel memory allocated to create rulesets is accounted and can be restricted
> +by the :doc:`/admin-guide/cgroup-v1/memory`.
> +
> +File renaming and linking
> +-------------------------
> +
> +Because Landlock targets unprivileged access controls, it is needed to properly
> +handle composition of rules. Such property also implies rules nesting.
> +Properly handling multiple layers of ruleset, each one of them able to restrict
> +access to files, also implies to inherit the ruleset restrictions from a parent
> +to its hierarchy. Because files are identified and restricted by their
> +hierarchy, moving or linking a file from one directory to another implies to
> +propagate the hierarchy constraints. To protect against privilege escalations
> +through renaming or linking, and for the sack of simplicity, Landlock currently
> +limits linking and renaming to the same directory. Future Landlock evolutions
> +will enable more flexibility for renaming and linking, with dedicated ruleset
> +flags.
> +
> +Filesystem layout modification
> +------------------------------
> +
> +As for file renaming and linking, a sandboxed thread cannot modify its
> +filesystem layout, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`.
> +However, :manpage:`chroot(2)` calls are not denied.
> +
> +Special filesystems
> +-------------------
> +
> +Access to regular files and directories can be restricted by Landlock,
> +according to the handled accesses of a ruleset. However, files that do not
> +come from a user-visible filesystem (e.g. pipe, socket), but can still be
> +accessed through /proc/self/fd/, cannot currently be restricted. Likewise,
> +some special kernel filesystems such as nsfs, which can be accessed through
> +/proc/self/ns/, cannot currently be restricted. For now, these kind of special
> +paths are then always allowed. Future Landlock evolutions will enable to
> +restrict such paths with dedicated ruleset flags.
> +
> +Questions and answers
> +=====================
> +
> +What about user space sandbox managers?
> +---------------------------------------
> +
> +Using user space process to enforce restrictions on kernel resources can lead
> +to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
> +the OS code and state
> +<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).
> +
> +What about namespaces and containers?
> +-------------------------------------
> +
> +Namespaces can help create sandboxes but they are not designed for
> +access-control and then miss useful features for such use case (e.g. no
> +fine-grained restrictions). Moreover, their complexity can lead to security
> +issues, especially when untrusted processes can manipulate them (cf.
> +`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).
> +
> +Additional documentation
> +========================
> +
> +* :doc:`/security/landlock`
> +* https://landlock.io
> +
> +.. Links
> +.. _samples/landlock/sandboxer.c:
> +   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 572e4288c60f..68276c73c33e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9942,6 +9942,8 @@ L:	linux-security-module@vger.kernel.org
>  S:	Supported
>  W:	https://landlock.io
>  T:	git https://github.com/landlock-lsm/linux.git
> +F:	Documentation/security/landlock.rst
> +F:	Documentation/userspace-api/landlock.rst
>  F:	include/uapi/linux/landlock.h
>  F:	security/landlock/
>  K:	landlock
> --
> 2.30.0
>

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
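
The "Ptrace restrictions" section of the quoted documentation says that the tracee must be in a sub-domain of the tracer. A minimal user-space sketch of that relation follows. It is not the kernel's code: `struct model_domain`, `model_is_subdomain()` and `model_may_ptrace()` are hypothetical names, and treating a domain as a sub-domain of itself is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of a Landlock domain: enforcing a ruleset creates a new
 * domain whose parent is the domain it was stacked on.
 */
struct model_domain {
	const struct model_domain *parent;	/* NULL for a first layer */
};

/*
 * Returns true if @child is @ancestor itself or was derived from it.
 * A NULL @ancestor stands for "not sandboxed"; every domain descends from it.
 */
bool model_is_subdomain(const struct model_domain *ancestor,
			const struct model_domain *child)
{
	if (!ancestor)
		return true;
	/* Walk up the tracee's chain of parents looking for the tracer's domain. */
	for (; child; child = child->parent) {
		if (child == ancestor)
			return true;
	}
	return false;
}

/* In this model, tracing is allowed only towards an equal or deeper domain. */
bool model_may_ptrace(const struct model_domain *tracer,
		      const struct model_domain *tracee)
{
	return model_is_subdomain(tracer, tracee);
}
```

With an `app` domain and a `worker` domain stacked on top of it, `model_may_ptrace(&app, &worker)` holds while `model_may_ptrace(&worker, &app)` does not, and a sandboxed tracer is refused for an unsandboxed (NULL) tracee.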
On 22/01/2021 09:33, Michael Kerrisk (man-pages) wrote:
> Hello Mickaël,

Hi Michael,

>
> It would be great to have some manual pages for these system calls
> before release... Can you prepare something?

Yes, I will start some pages based on the current documentation, but it
will be some time before seeing Landlock in a (stable) release. Could
you please take a look at the UAPI (patch 8/12)?

>
> Thanks,
>
> Michael
>> + >> +When a thread sandboxes itself, we have the guarantee that the related security >> +policy will stay enforced on all this thread's descendants. This allows >> +creating standalone and modular security policies per application, which will >> +automatically be composed between themselves according to their runtime parent >> +policies. >> + >> +Ptrace restrictions >> +------------------- >> + >> +A sandboxed process has less privileges than a non-sandboxed process and must >> +then be subject to additional restrictions when manipulating another process. >> +To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target >> +process, a sandboxed process should have a subset of the target process rules, >> +which means the tracee must be in a sub-domain of the tracer. >> + >> +Kernel interface >> +================ >> + >> +Access rights >> +------------- >> + >> +.. kernel-doc:: include/uapi/linux/landlock.h >> + :identifiers: fs_access >> + >> +Creating a new ruleset >> +---------------------- >> + >> +.. kernel-doc:: security/landlock/syscalls.c >> + :identifiers: sys_landlock_create_ruleset >> + >> +.. kernel-doc:: include/uapi/linux/landlock.h >> + :identifiers: landlock_ruleset_attr >> + >> +Extending a ruleset >> +------------------- >> + >> +.. kernel-doc:: security/landlock/syscalls.c >> + :identifiers: sys_landlock_add_rule >> + >> +.. kernel-doc:: include/uapi/linux/landlock.h >> + :identifiers: landlock_rule_type landlock_path_beneath_attr >> + >> +Enforcing a ruleset >> +------------------- >> + >> +.. kernel-doc:: security/landlock/syscalls.c >> + :identifiers: sys_landlock_enforce_ruleset_self >> + >> +Current limitations >> +=================== >> + >> +Ruleset layers >> +-------------- >> + >> +There is a limit of 64 layers of stacked rulesets. This can be an issue for a >> +task willing to enforce a new ruleset in complement to its 64 inherited >> +rulesets. 
Once this limit is reached, sys_landlock_enforce_ruleset_self() >> +returns E2BIG. It is then strongly suggested to carefully build rulesets once >> +in the life of a thread, especially for applications able to launch other >> +applications that may also want to sandbox themselves (e.g. shells, container >> +managers, etc.). >> + >> +Memory usage >> +------------ >> + >> +Kernel memory allocated to create rulesets is accounted and can be restricted >> +by the :doc:`/admin-guide/cgroup-v1/memory`. >> + >> +File renaming and linking >> +------------------------- >> + >> +Because Landlock targets unprivileged access controls, it is needed to properly >> +handle composition of rules. Such property also implies rules nesting. >> +Properly handling multiple layers of ruleset, each one of them able to restrict >> +access to files, also implies to inherit the ruleset restrictions from a parent >> +to its hierarchy. Because files are identified and restricted by their >> +hierarchy, moving or linking a file from one directory to another implies to >> +propagate the hierarchy constraints. To protect against privilege escalations >> +through renaming or linking, and for the sack of simplicity, Landlock currently >> +limits linking and renaming to the same directory. Future Landlock evolutions >> +will enable more flexibility for renaming and linking, with dedicated ruleset >> +flags. >> + >> +Filesystem layout modification >> +------------------------------ >> + >> +As for file renaming and linking, a sandboxed thread cannot modify its >> +filesystem layout, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`. >> +However, :manpage:`chroot(2)` calls are not denied. >> + >> +Special filesystems >> +------------------- >> + >> +Access to regular files and directories can be restricted by Landlock, >> +according to the handled accesses of a ruleset. However, files that do not >> +come from a user-visible filesystem (e.g. 
pipe, socket), but can still be >> +accessed through /proc/self/fd/, cannot currently be restricted. Likewise, >> +some special kernel filesystems such as nsfs, which can be accessed through >> +/proc/self/ns/, cannot currently be restricted. For now, these kind of special >> +paths are then always allowed. Future Landlock evolutions will enable to >> +restrict such paths with dedicated ruleset flags. >> + >> +Questions and answers >> +===================== >> + >> +What about user space sandbox managers? >> +--------------------------------------- >> + >> +Using user space process to enforce restrictions on kernel resources can lead >> +to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of >> +the OS code and state >> +<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_). >> + >> +What about namespaces and containers? >> +------------------------------------- >> + >> +Namespaces can help create sandboxes but they are not designed for >> +access-control and then miss useful features for such use case (e.g. no >> +fine-grained restrictions). Moreover, their complexity can lead to security >> +issues, especially when untrusted processes can manipulate them (cf. >> +`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_). >> + >> +Additional documentation >> +======================== >> + >> +* :doc:`/security/landlock` >> +* https://landlock.io >> + >> +.. Links >> +.. 
_samples/landlock/sandboxer.c: >> + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c >> diff --git a/MAINTAINERS b/MAINTAINERS >> index 572e4288c60f..68276c73c33e 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -9942,6 +9942,8 @@ L: linux-security-module@vger.kernel.org >> S: Supported >> W: https://landlock.io >> T: git https://github.com/landlock-lsm/linux.git >> +F: Documentation/security/landlock.rst >> +F: Documentation/userspace-api/landlock.rst >> F: include/uapi/linux/landlock.h >> F: security/landlock/ >> K: landlock >> -- >> 2.30.0 >> > >
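The layering rule stated in the quoted documentation (within one layer, at least one rule met on the path must grant the access, and every layer must be satisfied) can be modelled in user space. The sketch below is a toy model, not kernel code. The names `model_rule`, `model_domain`, `model_unmask_layers()`, `model_check_access()` and `model_x_allowed_on_a()`, the two access bits X and Y, and the string-based path walk are all assumptions made for illustration. It also assumes that a domain handles the union of the accesses handled by its layers, and that a rule implicitly grants whatever its own layer does not handle. It is loosely modelled on the bitmask walk in this series' security/landlock/fs.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical access bits for this model; not the UAPI values. */
#define ACCESS_X   (1u << 0)
#define ACCESS_Y   (1u << 1)
#define ACCESS_ALL (ACCESS_X | ACCESS_Y)

/*
 * A rule grants what it explicitly allows, plus every access that its own
 * layer does not handle (modelled after landlock_append_fs_rule()).
 */
#define ABSOLUTE_ACCESS(allowed, handled) \
	((allowed) | (ACCESS_ALL & ~(handled)))

struct model_rule {
	const char *path;	/* absolute path without trailing slash */
	unsigned int layer;	/* 1-based layer level */
	uint32_t access;	/* accesses granted by this rule */
};

struct model_domain {
	const struct model_rule *rules;
	size_t num_rules;
	unsigned int num_layers;	/* 1..64 */
	uint32_t handled;		/* union of the layers' handled accesses */
};

/* Clears the bit of each layer that has a granting rule tied to @path. */
uint64_t model_unmask_layers(const struct model_domain *dom, const char *path,
			     uint32_t request, uint64_t layer_mask)
{
	size_t i;

	for (i = 0; i < dom->num_rules; i++) {
		const struct model_rule *rule = &dom->rules[i];

		if (strcmp(rule->path, path) == 0 &&
		    (rule->access & request) == request)
			layer_mask &= ~(UINT64_C(1) << (rule->layer - 1));
	}
	return layer_mask;
}

/* Walks from @path up to "/" and returns true if every layer granted. */
bool model_check_access(const struct model_domain *dom, const char *path,
			uint32_t request)
{
	char walker[256];
	uint64_t layer_mask;
	size_t len = strlen(path);

	assert(dom->num_layers >= 1 && dom->num_layers <= 64);
	assert(path[0] == '/' && len < sizeof(walker));

	/* Accesses not handled by the domain are always allowed. */
	request &= dom->handled;
	if (request == 0)
		return true;

	layer_mask = UINT64_MAX >> (64 - dom->num_layers);
	memcpy(walker, path, len + 1);
	for (;;) {
		char *slash;

		layer_mask = model_unmask_layers(dom, walker, request,
						 layer_mask);
		if (layer_mask == 0)
			return true;
		if (strcmp(walker, "/") == 0)
			return false;	/* reached the root with layers left */
		slash = strrchr(walker, '/');
		if (slash == walker)
			slash[1] = '\0';	/* the parent is "/" */
		else
			*slash = '\0';
	}
}

/* Is X allowed on /a with layer1 alone, or with layer2 stacked on top? */
bool model_x_allowed_on_a(bool with_layer2)
{
	const struct model_rule rules[] = {
		/* layer1 handles only X and allows X on /a */
		{ "/a", 1, ABSOLUTE_ACCESS(ACCESS_X, ACCESS_X) },
		/* layer2 handles only Y and allows Y on /a/b */
		{ "/a/b", 2, ABSOLUTE_ACCESS(ACCESS_Y, ACCESS_Y) },
	};
	const struct model_domain dom = {
		.rules = rules,
		.num_rules = with_layer2 ? 2 : 1,
		.num_layers = with_layer2 ? 2 : 1,
		.handled = with_layer2 ? ACCESS_ALL : ACCESS_X,
	};

	return model_check_access(&dom, "/a", ACCESS_X);
}
```

In this model, `model_x_allowed_on_a(false)` returns true and `model_x_allowed_on_a(true)` returns false: stacking a second layer that only handles Y removes X on /a, because no layer2 rule is met between /a and the root. This is the same configuration as the corner case reported in the follow-up message.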
I found a corner case which is not handled well, e.g.

* layer1 only restricts X access and allows X access for /a
  -> X is allowed for /a
* layer2 only restricts Y access and allows Y access for /a/b
  -> X is not allowed for /a anymore, because the path walk for /a
     no longer encounters a layer2 rule

I'm working on a fix and I'll send a new patch series soon. Stay tuned!

On 21/01/2021 21:51, Mickaël Salaün wrote:
> From: Mickaël Salaün <mic@linux.microsoft.com>
>
> Thanks to the Landlock objects and ruleset, it is possible to identify
> inodes according to a process's domain. To enable an unprivileged
> process to express a file hierarchy, it first needs to open a directory
> (or a file) and pass this file descriptor to the kernel through
> landlock_add_rule(2). When checking if a file access request is
> allowed, we walk from the requested dentry to the real root, following
> the different mount layers. The access to each "tagged" inodes are
> collected according to their rule layer level, and ANDed to create
> access to the requested file hierarchy. This makes possible to identify
> a lot of files without tagging every inodes nor modifying the
> filesystem, while still following the view and understanding the user
> has from the filesystem.
>
> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
> keep the same struct inodes for the same inodes whereas these inodes are
> in use.
>
> This commit adds a minimal set of supported filesystem access-control
> which doesn't enable to restrict all file-related actions. This is the
> result of multiple discussions to minimize the code of Landlock to ease
> review. Thanks to the Landlock design, extending this access-control
> without breaking user space will not be a problem. Moreover, seccomp
> filters can be used to restrict the use of syscall families which may
> not be currently handled by Landlock.
> > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> > Cc: James Morris <jmorris@namei.org> > Cc: Jann Horn <jannh@google.com> > Cc: Jeff Dike <jdike@addtoit.com> > Cc: Kees Cook <keescook@chromium.org> > Cc: Richard Weinberger <richard@nod.at> > Cc: Serge E. Hallyn <serge@hallyn.com> > Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> > --- > > Changes since v26: > * Check each rule of a path to enable a more permissive and pragmatic > access control per layer. Suggested by Jann Horn: > https://lore.kernel.org/lkml/CAG48ez1O0VTwEiRd3KqexoF78WR+cmP5bGk5Kh5Cs7aPepiDVg@mail.gmail.com/ > * Rename check_access_path_continue() to unmask_layers() and make it > return the new layer mask. > * Avoid double domain check in hook_file_open(). > * In the documentation, add utime(2) as another example of unhandled > syscalls. Indeed, using `touch` to test write access may be tempting. > * Remove outdated comment about OverlayFS. > * Rename the landlock.h ifdef to align with most similar files. > * Fix spelling. > > Changes since v25: > * Move build_check_layer() to ruleset.c, and add built-time checks for > the fs_access_mask and access variables according to > _LANDLOCK_ACCESS_FS_MASK. > * Move limits to a dedicated file and rename them: > _LANDLOCK_ACCESS_FS_LAST and _LANDLOCK_ACCESS_FS_MASK. > * Set build_check_layer() as non-inline to trigger a warning if it is > not called. > * Use BITS_PER_TYPE() macro. > * Rename function to landlock_add_fs_hooks(). > * Cosmetic variable renames. > > Changes since v24: > * Use the new struct landlock_rule and landlock_layer to not mix > accesses from different layers. Revert "Enforce deterministic > interleaved path rules" from v24, and fix the layer check. This > enables to follow a sane semantic: an access is granted if, for each > policy layer, at least one rule encountered on the pathwalk grants the > access, regardless of their position in the layer stack (suggested by > Jann Horn). 
See layout1.interleaved_masked_accesses tests from > tools/testing/selftests/landlock/fs_test.c for corner cases. > * Add build-time checks for layers. > * Use the new landlock_insert_rule() API. > > Changes since v23: > * Enforce deterministic interleaved path rules. To have consistent > layered rules, granting access to a path implies that all accesses > tied to inodes, from the requested file to the real root, must be > checked. Otherwise, stacked rules may result to overzealous > restrictions. By excluding the ability to add exceptions in the same > layer (e.g. /a allowed, /a/b denied, and /a/b/c allowed), we get > deterministic interleaved path rules. This removes an optimization > which could be replaced by a proper cache mechanism. This also > further simplifies and explain check_access_path_continue(). > * Fix memory allocation error handling in landlock_create_object() > calls. This prevent to inadvertently hold an inode. > * In get_inode_object(), improve comments, make code more readable and > move kfree() call out of the lock window. > * Use the simplified landlock_insert_rule() API. > > Changes since v22: > * Simplify check_access_path_continue() (suggested by Jann Horn). > * Remove prefetch() call for now (suggested by Jann Horn). > * Fix spelling and remove superfluous comment (spotted by Jann Horn). > * Cosmetic variable renaming. > > Changes since v21: > * Rename ARCH_EPHEMERAL_STATES to ARCH_EPHEMERAL_INODES (suggested by > James Morris). > * Remove the LANDLOCK_ACCESS_FS_CHROOT right because chroot(2) (which > requires CAP_SYS_CHROOT) doesn't enable to bypass Landlock (as tests > demonstrate it), and because it is often used by sandboxes, it would > be counterproductive to forbid it. This also reduces the code size. > * Clean up documentation. > > Changes since v19: > * Fix spelling (spotted by Randy Dunlap). > > Changes since v18: > * Remove useless include. > * Fix spelling. 
> > Changes since v17: > * Replace landlock_release_inodes() with security_sb_delete() (requested > by James Morris). > * Replace struct super_block->s_landlock_inode_refs with the LSM > infrastructure management of the superblock (requested by James > Morris). > * Fix mknod restriction with a zero mode (spotted by Vincent Dagonneau). > * Minimize executed code in path_mknod and file_open hooks when the > current tasks is not sandboxed. > * Remove useless checks on the file pointer and inode in > hook_file_open() . > * Constify domain pointers. > * Rename inode_landlock() to landlock_inode(). > * Import include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* from > the ruleset and domain management patch. > * Explain the rational of this minimal set of access-control. > https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ > > Changes since v16: > * Add ARCH_EPHEMERAL_STATES and enable it for UML. > > Changes since v15: > * Replace layer_levels and layer_depth with a bitfield of layers: this > enables to properly manage superset and subset of access rights, > whatever their order in the stack of layers. > Cf. https://lore.kernel.org/lkml/e07fe473-1801-01cc-12ae-b3167f95250e@digikod.net/ > * Allow to open pipes and similar special files through /proc/self/fd/. > * Properly handle internal filesystems such as nsfs: always allow these > kind of roots because disconnected path cannot be evaluated. > * Remove the LANDLOCK_ACCESS_FS_LINK_TO and > LANDLOCK_ACCESS_FS_RENAME_{TO,FROM}, but use the > LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and LANDLOCK_ACCESS_FS_MAKE_* > instead. Indeed, it is not possible for now (and not really useful) > to express the semantic of a source and a destination. > * Check access rights to remove a directory or a file with rename(2). > * Forbid reparenting when linking or renaming. 
This is needed to easily > protect against possible privilege escalation by changing the place of > a file or directory in relation to an enforced access policy (from the > set of layers). This will be relaxed in the future. > * Update hooks to take into account replacement of the object's self and > beneath access bitfields with one. Simplify the code. > * Check file related access rights. > * Check d_is_negative() instead of !d_backing_inode() in > check_access_path_continue(), and continue the path walk while there > is no mapped inode e.g., with rename(2). > * Check private inode in check_access_path(). > * Optimize get_file_access() when dealing with a directory. > * Add missing atomic.h . > > Changes since v14: > * Simplify the object, rule and ruleset management at the expense of a > less aggressive memory freeing (contributed by Jann Horn, with > additional modifications): > - Rewrite release_inode() to use inode->sb->s_landlock_inode_refs. > - Remove useless checks in landlock_release_inodes(), clean object > pointer according to the new struct landlock_object and wait for all > iput() to complete. > - Rewrite get_inode_object() according to the new struct > landlock_object. If there is a race-condition when cleaning up an > object, we retry until the concurrent thread finished the object > cleaning. > Cf. https://lore.kernel.org/lkml/CAG48ez21bEn0wL1bbmTiiu8j9jP5iEWtHOwz4tURUJ+ki0ydYw@mail.gmail.com/ > * Fix nested domains by implementing a notion of layer level and depth: > - Check for matching level ranges when walking through a file path. > - Only allow access if every layer granted the access request. > * Handles files without mount points (e.g. pipes). > * Hardens path walk by checking inode pointer values. > * Prefetches d_parent when walking to the root directory. > * Remove useless inode_alloc_security hook() (suggested by Jann Horn): > already initialized by lsm_inode_alloc(). > * Remove the inode_free_security hook. 
> * Remove access checks that may be required for FD-only requests: > truncate, getattr, lock, chmod, chown, chgrp, ioctl. This will be > handle in a future evolution of Landlock, but right now the goal is to > lighten the code to ease review. > * Constify variables. > * Move ABI checks into syscall.c . > * Cosmetic variable renames. > > Changes since v11: > * Add back, revamp and make a fully working filesystem access-control > based on paths and inodes. > * Remove the eBPF dependency. > > Previous changes: > https://lore.kernel.org/lkml/20190721213116.23476-6-mic@digikod.net/ > --- > MAINTAINERS | 1 + > arch/Kconfig | 7 + > arch/um/Kconfig | 1 + > include/uapi/linux/landlock.h | 75 ++++ > security/landlock/Kconfig | 2 +- > security/landlock/Makefile | 2 +- > security/landlock/fs.c | 621 ++++++++++++++++++++++++++++++++++ > security/landlock/fs.h | 56 +++ > security/landlock/limits.h | 4 + > security/landlock/ruleset.c | 4 + > security/landlock/setup.c | 7 + > security/landlock/setup.h | 2 + > 12 files changed, 780 insertions(+), 2 deletions(-) > create mode 100644 include/uapi/linux/landlock.h > create mode 100644 security/landlock/fs.c > create mode 100644 security/landlock/fs.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index 74406a6bc6ee..572e4288c60f 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -9942,6 +9942,7 @@ L: linux-security-module@vger.kernel.org > S: Supported > W: https://landlock.io > T: git https://github.com/landlock-lsm/linux.git > +F: include/uapi/linux/landlock.h > F: security/landlock/ > K: landlock > K: LANDLOCK > diff --git a/arch/Kconfig b/arch/Kconfig > index 24862d15f3a3..54999569c755 100644 > --- a/arch/Kconfig > +++ b/arch/Kconfig > @@ -946,6 +946,13 @@ config COMPAT_32BIT_TIME > config ARCH_NO_PREEMPT > bool > > +config ARCH_EPHEMERAL_INODES > + def_bool n > + help > + An arch should select this symbol if it doesn't keep track of inode > + instances on its own, but instead relies on something else (e.g. 
the host > + kernel for an UML kernel). > + > config ARCH_SUPPORTS_RT > bool > > diff --git a/arch/um/Kconfig b/arch/um/Kconfig > index 34d302d1a07f..451787332335 100644 > --- a/arch/um/Kconfig > +++ b/arch/um/Kconfig > @@ -5,6 +5,7 @@ menu "UML-specific options" > config UML > bool > default y > + select ARCH_EPHEMERAL_INODES > select ARCH_HAS_KCOV > select ARCH_NO_PREEMPT > select HAVE_ARCH_AUDITSYSCALL > diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h > new file mode 100644 > index 000000000000..f69877099c8e > --- /dev/null > +++ b/include/uapi/linux/landlock.h > @@ -0,0 +1,75 @@ > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > +/* > + * Landlock - User space API > + * > + * Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net> > + * Copyright © 2018-2020 ANSSI > + */ > + > +#ifndef _UAPI_LINUX_LANDLOCK_H > +#define _UAPI_LINUX_LANDLOCK_H > + > +/** > + * DOC: fs_access > + * > + * A set of actions on kernel objects may be defined by an attribute (e.g. > + * &struct landlock_path_beneath_attr) including a bitmask of access. > + * > + * Filesystem flags > + * ~~~~~~~~~~~~~~~~ > + * > + * These flags enable to restrict a sandboxed process to a set of actions on > + * files and directories. Files or directories opened before the sandboxing > + * are not subject to these restrictions. > + * > + * A file can only receive these access rights: > + * > + * - %LANDLOCK_ACCESS_FS_EXECUTE: Execute a file. > + * - %LANDLOCK_ACCESS_FS_WRITE_FILE: Open a file with write access. > + * - %LANDLOCK_ACCESS_FS_READ_FILE: Open a file with read access. > + * > + * A directory can receive access rights related to files or directories. The > + * following access right is applied to the directory itself, and the > + * directories beneath it: > + * > + * - %LANDLOCK_ACCESS_FS_READ_DIR: Open a directory or list its content. 
> + * > + * However, the following access rights only apply to the content of a > + * directory, not the directory itself: > + * > + * - %LANDLOCK_ACCESS_FS_REMOVE_DIR: Remove an empty directory or rename one. > + * - %LANDLOCK_ACCESS_FS_REMOVE_FILE: Unlink (or rename) a file. > + * - %LANDLOCK_ACCESS_FS_MAKE_CHAR: Create (or rename or link) a character > + * device. > + * - %LANDLOCK_ACCESS_FS_MAKE_DIR: Create (or rename) a directory. > + * - %LANDLOCK_ACCESS_FS_MAKE_REG: Create (or rename or link) a regular file. > + * - %LANDLOCK_ACCESS_FS_MAKE_SOCK: Create (or rename or link) a UNIX domain > + * socket. > + * - %LANDLOCK_ACCESS_FS_MAKE_FIFO: Create (or rename or link) a named pipe. > + * - %LANDLOCK_ACCESS_FS_MAKE_BLOCK: Create (or rename or link) a block device. > + * - %LANDLOCK_ACCESS_FS_MAKE_SYM: Create (or rename or link) a symbolic link. > + * > + * .. warning:: > + * > + * It is currently not possible to restrict some file-related actions > + * accessible through these syscall families: :manpage:`chdir(2)`, > + * :manpage:`truncate(2)`, :manpage:`stat(2)`, :manpage:`flock(2)`, > + * :manpage:`chmod(2)`, :manpage:`chown(2)`, :manpage:`setxattr(2)`, > + * :manpage:`utime(2)`, :manpage:`ioctl(2)`, :manpage:`fcntl(2)`. > + * Future Landlock evolutions will enable to restrict them. 
> + */ > +#define LANDLOCK_ACCESS_FS_EXECUTE (1ULL << 0) > +#define LANDLOCK_ACCESS_FS_WRITE_FILE (1ULL << 1) > +#define LANDLOCK_ACCESS_FS_READ_FILE (1ULL << 2) > +#define LANDLOCK_ACCESS_FS_READ_DIR (1ULL << 3) > +#define LANDLOCK_ACCESS_FS_REMOVE_DIR (1ULL << 4) > +#define LANDLOCK_ACCESS_FS_REMOVE_FILE (1ULL << 5) > +#define LANDLOCK_ACCESS_FS_MAKE_CHAR (1ULL << 6) > +#define LANDLOCK_ACCESS_FS_MAKE_DIR (1ULL << 7) > +#define LANDLOCK_ACCESS_FS_MAKE_REG (1ULL << 8) > +#define LANDLOCK_ACCESS_FS_MAKE_SOCK (1ULL << 9) > +#define LANDLOCK_ACCESS_FS_MAKE_FIFO (1ULL << 10) > +#define LANDLOCK_ACCESS_FS_MAKE_BLOCK (1ULL << 11) > +#define LANDLOCK_ACCESS_FS_MAKE_SYM (1ULL << 12) > + > +#endif /* _UAPI_LINUX_LANDLOCK_H */ > diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig > index 42a659e81196..e4b9d453bc70 100644 > --- a/security/landlock/Kconfig > +++ b/security/landlock/Kconfig > @@ -2,7 +2,7 @@ > > config SECURITY_LANDLOCK > bool "Landlock support" > - depends on SECURITY > + depends on SECURITY && !ARCH_EPHEMERAL_INODES > select SECURITY_PATH > help > Landlock is a safe sandboxing mechanism that enables processes to > diff --git a/security/landlock/Makefile b/security/landlock/Makefile > index f1d1eb72fa76..92e3d80ab8ed 100644 > --- a/security/landlock/Makefile > +++ b/security/landlock/Makefile > @@ -1,4 +1,4 @@ > obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o > > landlock-y := setup.o object.o ruleset.o \ > - cred.o ptrace.o > + cred.o ptrace.o fs.o > diff --git a/security/landlock/fs.c b/security/landlock/fs.c > new file mode 100644 > index 000000000000..27f5d40038b1 > --- /dev/null > +++ b/security/landlock/fs.c > @@ -0,0 +1,621 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Landlock LSM - Filesystem management and hooks > + * > + * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net> > + * Copyright © 2018-2020 ANSSI > + */ > + > +#include <linux/atomic.h> > +#include <linux/bitops.h> > +#include <linux/bits.h> > +#include 
<linux/compiler_types.h> > +#include <linux/dcache.h> > +#include <linux/err.h> > +#include <linux/fs.h> > +#include <linux/init.h> > +#include <linux/kernel.h> > +#include <linux/limits.h> > +#include <linux/list.h> > +#include <linux/lsm_hooks.h> > +#include <linux/mount.h> > +#include <linux/namei.h> > +#include <linux/path.h> > +#include <linux/rcupdate.h> > +#include <linux/spinlock.h> > +#include <linux/stat.h> > +#include <linux/types.h> > +#include <linux/wait_bit.h> > +#include <linux/workqueue.h> > +#include <uapi/linux/landlock.h> > + > +#include "common.h" > +#include "cred.h" > +#include "fs.h" > +#include "limits.h" > +#include "object.h" > +#include "ruleset.h" > +#include "setup.h" > + > +/* Underlying object management */ > + > +static void release_inode(struct landlock_object *const object) > + __releases(object->lock) > +{ > + struct inode *const inode = object->underobj; > + struct super_block *sb; > + > + if (!inode) { > + spin_unlock(&object->lock); > + return; > + } > + > + spin_lock(&inode->i_lock); > + /* > + * Make sure that if the filesystem is concurrently unmounted, > + * hook_sb_delete() will wait for us to finish iput(). > + */ > + sb = inode->i_sb; > + atomic_long_inc(&landlock_superblock(sb)->inode_refs); > + rcu_assign_pointer(landlock_inode(inode)->object, NULL); > + spin_unlock(&inode->i_lock); > + spin_unlock(&object->lock); > + /* > + * Now, new rules can safely be tied to @inode. 
> + */ > + > + iput(inode); > + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs)) > + wake_up_var(&landlock_superblock(sb)->inode_refs); > +} > + > +static const struct landlock_object_underops landlock_fs_underops = { > + .release = release_inode > +}; > + > +/* Ruleset management */ > + > +static struct landlock_object *get_inode_object(struct inode *const inode) > +{ > + struct landlock_object *object, *new_object; > + struct landlock_inode_security *inode_sec = landlock_inode(inode); > + > + rcu_read_lock(); > +retry: > + object = rcu_dereference(inode_sec->object); > + if (object) { > + if (likely(refcount_inc_not_zero(&object->usage))) { > + rcu_read_unlock(); > + return object; > + } > + /* > + * We are racing with release_inode(), the object is going > + * away. Wait for release_inode(), then retry. > + */ > + spin_lock(&object->lock); > + spin_unlock(&object->lock); > + goto retry; > + } > + rcu_read_unlock(); > + > + /* > + * If there is no object tied to @inode, then create a new one (without > + * holding any locks). > + */ > + new_object = landlock_create_object(&landlock_fs_underops, inode); > + if (IS_ERR(new_object)) > + return new_object; > + > + spin_lock(&inode->i_lock); > + object = rcu_dereference_protected(inode_sec->object, > + lockdep_is_held(&inode->i_lock)); > + if (unlikely(object)) { > + /* Someone else just created the object, bail out and retry. */ > + spin_unlock(&inode->i_lock); > + kfree(new_object); > + > + rcu_read_lock(); > + goto retry; > + } > + > + rcu_assign_pointer(inode_sec->object, new_object); > + /* > + * @inode will be released by hook_sb_delete() on its superblock > + * shutdown. > + */ > + ihold(inode); > + spin_unlock(&inode->i_lock); > + return new_object; > +} > + > +/* All access rights that can be tied to files. 
> + */
> +#define ACCESS_FILE ( \
> +	LANDLOCK_ACCESS_FS_EXECUTE | \
> +	LANDLOCK_ACCESS_FS_WRITE_FILE | \
> +	LANDLOCK_ACCESS_FS_READ_FILE)
> +
> +/*
> + * @path: Should have been checked by get_path_from_fd().
> + */
> +int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> +		const struct path *const path, u32 access_rights)
> +{
> +	int err;
> +	struct landlock_object *object;
> +
> +	/* Files only get access rights that make sense. */
> +	if (!d_is_dir(path->dentry) && (access_rights | ACCESS_FILE) !=
> +			ACCESS_FILE)
> +		return -EINVAL;
> +
> +	/* Transforms relative access rights to absolute ones. */
> +	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~ruleset->fs_access_mask;
> +	object = get_inode_object(d_backing_inode(path->dentry));
> +	if (IS_ERR(object))
> +		return PTR_ERR(object);
> +	mutex_lock(&ruleset->lock);
> +	err = landlock_insert_rule(ruleset, object, access_rights);
> +	mutex_unlock(&ruleset->lock);
> +	/*
> +	 * No need to check for an error because landlock_insert_rule()
> +	 * increments the refcount for the new object if needed.
> +	 */
> +	landlock_put_object(object);
> +	return err;
> +}
> +
> +/* Access-control management */
> +
> +static inline u64 unmask_layers(
> +		const struct landlock_ruleset *const domain,
> +		const struct path *const path, const u32 access_request,
> +		u64 layer_mask)
> +{
> +	const struct landlock_rule *rule;
> +	const struct inode *inode;
> +	size_t i;
> +
> +	if (d_is_negative(path->dentry))
> +		/* Continues to walk while there is no mapped inode. */
> +		return layer_mask;
> +	inode = d_backing_inode(path->dentry);
> +	rcu_read_lock();
> +	rule = landlock_find_rule(domain,
> +			rcu_dereference(landlock_inode(inode)->object));
> +	rcu_read_unlock();
> +	if (!rule)
> +		return layer_mask;
> +
> +	/*
> +	 * An access is granted if, for each policy layer, at least one rule
> +	 * encountered on the pathwalk grants the requested accesses,
> +	 * regardless of their position in the layer stack. We must then check
> +	 * the remaining layers for each inode, from the last added layer to
> +	 * the first one.
> +	 */
> +	for (i = 0; i < rule->num_layers; i++) {
> +		const struct landlock_layer *const layer = &rule->layers[i];
> +		const u64 layer_level = BIT_ULL(layer->level - 1);
> +
> +		if ((layer->access & access_request) == access_request) {
> +			layer_mask &= ~layer_level;
> +
> +			if (layer_mask == 0)
> +				return layer_mask;
> +		}
> +	}
> +	return layer_mask;
> +}
> +
> +static int check_access_path(const struct landlock_ruleset *const domain,
> +		const struct path *const path, u32 access_request)
> +{
> +	bool allowed = false;
> +	struct path walker_path;
> +	u64 layer_mask;
> +
> +	/* Make sure all layers can be checked. */
> +	BUILD_BUG_ON(BITS_PER_TYPE(layer_mask) < LANDLOCK_MAX_NUM_LAYERS);
> +
> +	if (WARN_ON_ONCE(!domain || !path))
> +		return 0;
> +	/*
> +	 * Allows access to pseudo filesystems that will never be mountable
> +	 * (e.g. sockfs, pipefs), but can still be reachable through
> +	 * /proc/self/fd .
> +	 */
> +	if ((path->dentry->d_sb->s_flags & SB_NOUSER) ||
> +			(d_is_positive(path->dentry) &&
> +			 unlikely(IS_PRIVATE(d_backing_inode(path->dentry)))))
> +		return 0;
> +	if (WARN_ON_ONCE(domain->num_layers < 1))
> +		return -EACCES;
> +
> +	layer_mask = GENMASK_ULL(domain->num_layers - 1, 0);
> +	/*
> +	 * An access request that is not handled by the domain should be
> +	 * allowed.
> +	 */
> +	access_request &= domain->fs_access_mask;
> +	if (access_request == 0)
> +		return 0;
> +	walker_path = *path;
> +	path_get(&walker_path);
> +	/*
> +	 * We need to walk through all the hierarchy to not miss any relevant
> +	 * restriction.
> +	 */
> +	while (true) {
> +		struct dentry *parent_dentry;
> +
> +		layer_mask = unmask_layers(domain, &walker_path,
> +				access_request, layer_mask);
> +		if (layer_mask == 0) {
> +			/* Stops when a rule from each layer grants access. */
> +			allowed = true;
> +			break;
> +		}
> +
> +jump_up:
> +		if (walker_path.dentry == walker_path.mnt->mnt_root) {
> +			if (follow_up(&walker_path)) {
> +				/* Ignores hidden mount points. */
> +				goto jump_up;
> +			} else {
> +				/*
> +				 * Stops at the real root. Denies access
> +				 * because not all layers have granted access.
> +				 */
> +				allowed = false;
> +				break;
> +			}
> +		}
> +		if (unlikely(IS_ROOT(walker_path.dentry))) {
> +			/*
> +			 * Stops at disconnected root directories. Only allows
> +			 * access to internal filesystems (e.g. nsfs, which is
> +			 * reachable through /proc/self/ns).
> +			 */
> +			allowed = !!(walker_path.mnt->mnt_flags & MNT_INTERNAL);
> +			break;
> +		}
> +		parent_dentry = dget_parent(walker_path.dentry);
> +		dput(walker_path.dentry);
> +		walker_path.dentry = parent_dentry;
> +	}
> +	path_put(&walker_path);
> +	return allowed ? 0 : -EACCES;
> +}
> +
> +static inline int current_check_access_path(const struct path *const path,
> +		const u32 access_request)
> +{
> +	const struct landlock_ruleset *const dom =
> +		landlock_get_current_domain();
> +
> +	if (!dom)
> +		return 0;
> +	return check_access_path(dom, path, access_request);
> +}
> +
> +/* Super-block hooks */
> +
> +/*
> + * Release the inodes used in a security policy.
> + *
> + * Cf.
> + * fsnotify_unmount_inodes()
> + */
> +static void hook_sb_delete(struct super_block *const sb)
> +{
> +	struct inode *inode, *iput_inode = NULL;
> +
> +	if (!landlock_initialized)
> +		return;
> +
> +	spin_lock(&sb->s_inode_list_lock);
> +	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
> +		struct landlock_inode_security *inode_sec =
> +			landlock_inode(inode);
> +		struct landlock_object *object;
> +		bool do_put = false;
> +
> +		rcu_read_lock();
> +		object = rcu_dereference(inode_sec->object);
> +		if (!object) {
> +			rcu_read_unlock();
> +			continue;
> +		}
> +
> +		spin_lock(&object->lock);
> +		if (object->underobj) {
> +			object->underobj = NULL;
> +			do_put = true;
> +			spin_lock(&inode->i_lock);
> +			rcu_assign_pointer(inode_sec->object, NULL);
> +			spin_unlock(&inode->i_lock);
> +		}
> +		spin_unlock(&object->lock);
> +		rcu_read_unlock();
> +		if (!do_put)
> +			/*
> +			 * A concurrent iput() in release_inode() is ongoing
> +			 * and we will just wait for it to finish.
> +			 */
> +			continue;
> +
> +		/*
> +		 * At this point, we own the ihold() reference that was
> +		 * originally set up by get_inode_object(). Therefore we can
> +		 * drop the list lock and know that the inode won't disappear
> +		 * from under us until the next loop walk.
> +		 */
> +		spin_unlock(&sb->s_inode_list_lock);
> +		/*
> +		 * We can now actually put the previous inode, which is not
> +		 * needed anymore for the loop walk.
> +		 */
> +		if (iput_inode)
> +			iput(iput_inode);
> +		iput_inode = inode;
> +		spin_lock(&sb->s_inode_list_lock);
> +	}
> +	spin_unlock(&sb->s_inode_list_lock);
> +	if (iput_inode)
> +		iput(iput_inode);
> +
> +	/*
> +	 * Wait for pending iput() in release_inode().
> +	 */
> +	wait_var_event(&landlock_superblock(sb)->inode_refs, !atomic_long_read(
> +				&landlock_superblock(sb)->inode_refs));
> +}
> +
> +/*
> + * Because a Landlock security policy is defined according to the filesystem
> + * layout (i.e. the mount namespace), changing it may grant access to files not
> + * previously allowed.
> + *
> + * To make it simple, deny any filesystem layout modification by landlocked
> + * processes. Non-landlocked processes may still change the namespace of a
> + * landlocked process, but this kind of threat must be handled by a system-wide
> + * access-control security policy.
> + *
> + * This could be lifted in the future if Landlock can safely handle mount
> + * namespace updates requested by a landlocked process. Indeed, we could
> + * update the current domain (which is currently read-only) by taking into
> + * account the accesses of the source and the destination of a new mount point.
> + * However, it would also require to make all the child domains dynamically
> + * inherit these new constraints. Anyway, for backward compatibility reasons,
> + * a dedicated user space option would be required (e.g. as a ruleset command
> + * option).
> + */
> +static int hook_sb_mount(const char *const dev_name,
> +		const struct path *const path, const char *const type,
> +		const unsigned long flags, void *const data)
> +{
> +	if (!landlock_get_current_domain())
> +		return 0;
> +	return -EPERM;
> +}
> +
> +static int hook_move_mount(const struct path *const from_path,
> +		const struct path *const to_path)
> +{
> +	if (!landlock_get_current_domain())
> +		return 0;
> +	return -EPERM;
> +}
> +
> +/*
> + * Removing a mount point may reveal a previously hidden file hierarchy, which
> + * may then grant access to files, which may have previously been forbidden.
> + */
> +static int hook_sb_umount(struct vfsmount *const mnt, const int flags)
> +{
> +	if (!landlock_get_current_domain())
> +		return 0;
> +	return -EPERM;
> +}
> +
> +static int hook_sb_remount(struct super_block *const sb, void *const mnt_opts)
> +{
> +	if (!landlock_get_current_domain())
> +		return 0;
> +	return -EPERM;
> +}
> +
> +/*
> + * pivot_root(2), like mount(2), changes the current mount namespace. It must
> + * then be forbidden for a landlocked process.
> + *
> + * However, chroot(2) may be allowed because it only changes the relative root
> + * directory of the current process. Moreover, it can be used to restrict the
> + * view of the filesystem.
> + */
> +static int hook_sb_pivotroot(const struct path *const old_path,
> +		const struct path *const new_path)
> +{
> +	if (!landlock_get_current_domain())
> +		return 0;
> +	return -EPERM;
> +}
> +
> +/* Path hooks */
> +
> +static inline u32 get_mode_access(const umode_t mode)
> +{
> +	switch (mode & S_IFMT) {
> +	case S_IFLNK:
> +		return LANDLOCK_ACCESS_FS_MAKE_SYM;
> +	case 0:
> +		/* A zero mode translates to S_IFREG. */
> +	case S_IFREG:
> +		return LANDLOCK_ACCESS_FS_MAKE_REG;
> +	case S_IFDIR:
> +		return LANDLOCK_ACCESS_FS_MAKE_DIR;
> +	case S_IFCHR:
> +		return LANDLOCK_ACCESS_FS_MAKE_CHAR;
> +	case S_IFBLK:
> +		return LANDLOCK_ACCESS_FS_MAKE_BLOCK;
> +	case S_IFIFO:
> +		return LANDLOCK_ACCESS_FS_MAKE_FIFO;
> +	case S_IFSOCK:
> +		return LANDLOCK_ACCESS_FS_MAKE_SOCK;
> +	default:
> +		WARN_ON_ONCE(1);
> +		return 0;
> +	}
> +}
> +
> +/*
> + * Creating multiple links or renaming may lead to privilege escalations if not
> + * handled properly. Indeed, we must be sure that the source doesn't gain more
> + * privileges by being accessible from the destination. This is getting more
> + * complex when dealing with multiple layers. The whole picture can be seen as
> + * a multilayer partial ordering problem. A future version of Landlock will
> + * deal with that.
> + */
> +static int hook_path_link(struct dentry *const old_dentry,
> +		const struct path *const new_dir,
> +		struct dentry *const new_dentry)
> +{
> +	const struct landlock_ruleset *const dom =
> +		landlock_get_current_domain();
> +
> +	if (!dom)
> +		return 0;
> +	/* The mount points are the same for old and new paths, cf. EXDEV. */
> +	if (old_dentry->d_parent != new_dir->dentry)
> +		/* For now, forbid reparenting.
> +		 */
> +		return -EACCES;
> +	if (unlikely(d_is_negative(old_dentry)))
> +		return -EACCES;
> +	return check_access_path(dom, new_dir,
> +			get_mode_access(d_backing_inode(old_dentry)->i_mode));
> +}
> +
> +static inline u32 maybe_remove(const struct dentry *const dentry)
> +{
> +	if (d_is_negative(dentry))
> +		return 0;
> +	return d_is_dir(dentry) ? LANDLOCK_ACCESS_FS_REMOVE_DIR :
> +		LANDLOCK_ACCESS_FS_REMOVE_FILE;
> +}
> +
> +static int hook_path_rename(const struct path *const old_dir,
> +		struct dentry *const old_dentry,
> +		const struct path *const new_dir,
> +		struct dentry *const new_dentry)
> +{
> +	const struct landlock_ruleset *const dom =
> +		landlock_get_current_domain();
> +
> +	if (!dom)
> +		return 0;
> +	/* The mount points are the same for old and new paths, cf. EXDEV. */
> +	if (old_dir->dentry != new_dir->dentry)
> +		/* For now, forbid reparenting. */
> +		return -EACCES;
> +	if (WARN_ON_ONCE(d_is_negative(old_dentry)))
> +		return -EACCES;
> +	/* RENAME_EXCHANGE is handled because directories are the same. */
> +	return check_access_path(dom, old_dir, maybe_remove(old_dentry) |
> +			maybe_remove(new_dentry) |
> +			get_mode_access(d_backing_inode(old_dentry)->i_mode));
> +}
> +
> +static int hook_path_mkdir(const struct path *const dir,
> +		struct dentry *const dentry, const umode_t mode)
> +{
> +	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_DIR);
> +}
> +
> +static int hook_path_mknod(const struct path *const dir,
> +		struct dentry *const dentry, const umode_t mode,
> +		const unsigned int dev)
> +{
> +	const struct landlock_ruleset *const dom =
> +		landlock_get_current_domain();
> +
> +	if (!dom)
> +		return 0;
> +	return check_access_path(dom, dir, get_mode_access(mode));
> +}
> +
> +static int hook_path_symlink(const struct path *const dir,
> +		struct dentry *const dentry, const char *const old_name)
> +{
> +	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_SYM);
> +}
> +
> +static int hook_path_unlink(const struct path *const dir,
> +		struct dentry *const dentry)
> +{
> +	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_FILE);
> +}
> +
> +static int hook_path_rmdir(const struct path *const dir,
> +		struct dentry *const dentry)
> +{
> +	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_DIR);
> +}
> +
> +/* File hooks */
> +
> +static inline u32 get_file_access(const struct file *const file)
> +{
> +	u32 access = 0;
> +
> +	if (file->f_mode & FMODE_READ) {
> +		/* A directory can only be opened in read mode. */
> +		if (S_ISDIR(file_inode(file)->i_mode))
> +			return LANDLOCK_ACCESS_FS_READ_DIR;
> +		access = LANDLOCK_ACCESS_FS_READ_FILE;
> +	}
> +	if (file->f_mode & FMODE_WRITE)
> +		access |= LANDLOCK_ACCESS_FS_WRITE_FILE;
> +	/* __FMODE_EXEC is indeed part of f_flags, not f_mode.
> +	 */
> +	if (file->f_flags & __FMODE_EXEC)
> +		access |= LANDLOCK_ACCESS_FS_EXECUTE;
> +	return access;
> +}
> +
> +static int hook_file_open(struct file *const file)
> +{
> +	const struct landlock_ruleset *const dom =
> +		landlock_get_current_domain();
> +
> +	if (!dom)
> +		return 0;
> +	/*
> +	 * Because a file may be opened with O_PATH, get_file_access() may
> +	 * return 0. This case will be handled with a future Landlock
> +	 * evolution.
> +	 */
> +	return check_access_path(dom, &file->f_path, get_file_access(file));
> +}
> +
> +static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
> +	LSM_HOOK_INIT(sb_delete, hook_sb_delete),
> +	LSM_HOOK_INIT(sb_mount, hook_sb_mount),
> +	LSM_HOOK_INIT(move_mount, hook_move_mount),
> +	LSM_HOOK_INIT(sb_umount, hook_sb_umount),
> +	LSM_HOOK_INIT(sb_remount, hook_sb_remount),
> +	LSM_HOOK_INIT(sb_pivotroot, hook_sb_pivotroot),
> +
> +	LSM_HOOK_INIT(path_link, hook_path_link),
> +	LSM_HOOK_INIT(path_rename, hook_path_rename),
> +	LSM_HOOK_INIT(path_mkdir, hook_path_mkdir),
> +	LSM_HOOK_INIT(path_mknod, hook_path_mknod),
> +	LSM_HOOK_INIT(path_symlink, hook_path_symlink),
> +	LSM_HOOK_INIT(path_unlink, hook_path_unlink),
> +	LSM_HOOK_INIT(path_rmdir, hook_path_rmdir),
> +
> +	LSM_HOOK_INIT(file_open, hook_file_open),
> +};
> +
> +__init void landlock_add_fs_hooks(void)
> +{
> +	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
> +			LANDLOCK_NAME);
> +}
> diff --git a/security/landlock/fs.h b/security/landlock/fs.h
> new file mode 100644
> index 000000000000..9f14ec4d8d48
> --- /dev/null
> +++ b/security/landlock/fs.h
> @@ -0,0 +1,56 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Landlock LSM - Filesystem management and hooks
> + *
> + * Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
> + * Copyright © 2018-2020 ANSSI
> + */
> +
> +#ifndef _SECURITY_LANDLOCK_FS_H
> +#define _SECURITY_LANDLOCK_FS_H
> +
> +#include <linux/fs.h>
> +#include <linux/init.h>
> +#include <linux/rcupdate.h>
> +
> +#include "ruleset.h"
> +#include "setup.h"
> +
> +struct landlock_inode_security {
> +	/*
> +	 * @object: Weak pointer to an allocated object. All writes (i.e.
> +	 * creating a new object or removing one) are protected by the
> +	 * underlying inode->i_lock. Disassociating @object from the inode is
> +	 * additionally protected by @object->lock, from the time @object's
> +	 * usage refcount drops to zero to the time this pointer is nulled out.
> +	 * Cf. release_inode().
> +	 */
> +	struct landlock_object __rcu *object;
> +};
> +
> +struct landlock_superblock_security {
> +	/*
> +	 * @inode_refs: References to Landlock underlying objects.
> +	 * Cf. struct super_block->s_fsnotify_inode_refs .
> +	 */
> +	atomic_long_t inode_refs;
> +};
> +
> +static inline struct landlock_inode_security *landlock_inode(
> +		const struct inode *const inode)
> +{
> +	return inode->i_security + landlock_blob_sizes.lbs_inode;
> +}
> +
> +static inline struct landlock_superblock_security *landlock_superblock(
> +		const struct super_block *const superblock)
> +{
> +	return superblock->s_security + landlock_blob_sizes.lbs_superblock;
> +}
> +
> +__init void landlock_add_fs_hooks(void);
> +
> +int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> +		const struct path *const path, u32 access_hierarchy);
> +
> +#endif /* _SECURITY_LANDLOCK_FS_H */
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> index b734f597bb0e..2a0a1095ee27 100644
> --- a/security/landlock/limits.h
> +++ b/security/landlock/limits.h
> @@ -10,8 +10,12 @@
>  #define _SECURITY_LANDLOCK_LIMITS_H
>
>  #include <linux/limits.h>
> +#include <uapi/linux/landlock.h>
>
>  #define LANDLOCK_MAX_NUM_LAYERS 64
>  #define LANDLOCK_MAX_NUM_RULES U32_MAX
>
> +#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
> +#define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
> +
>  #endif /* _SECURITY_LANDLOCK_LIMITS_H */
> diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
> index bf7ff66c1b12..548636a68b48 100644
> --- a/security/landlock/ruleset.c
> +++ b/security/landlock/ruleset.c
> @@ -112,10 +112,12 @@ static void build_check_ruleset(void)
>  	const struct landlock_ruleset ruleset = {
>  		.num_rules = ~0,
>  		.num_layers = ~0,
> +		.fs_access_mask = ~0,
>  	};
>
>  	BUILD_BUG_ON(ruleset.num_rules < LANDLOCK_MAX_NUM_RULES);
>  	BUILD_BUG_ON(ruleset.num_layers < LANDLOCK_MAX_NUM_LAYERS);
> +	BUILD_BUG_ON(ruleset.fs_access_mask < LANDLOCK_MASK_ACCESS_FS);
>  }
>
>  /**
> @@ -214,9 +216,11 @@ static void build_check_layer(void)
>  {
>  	const struct landlock_layer layer = {
>  		.level = ~0,
> +		.access = ~0,
>  	};
>
>  	BUILD_BUG_ON(layer.level < LANDLOCK_MAX_NUM_LAYERS);
> +	BUILD_BUG_ON(layer.access < LANDLOCK_MASK_ACCESS_FS);
>  }
>
>  int landlock_insert_rule(struct landlock_ruleset *const ruleset,
> diff --git a/security/landlock/setup.c b/security/landlock/setup.c
> index a5d6ef334991..f8e8e980454c 100644
> --- a/security/landlock/setup.c
> +++ b/security/landlock/setup.c
> @@ -11,17 +11,24 @@
>
>  #include "common.h"
>  #include "cred.h"
> +#include "fs.h"
>  #include "ptrace.h"
>  #include "setup.h"
>
> +bool landlock_initialized __lsm_ro_after_init = false;
> +
>  struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
>  	.lbs_cred = sizeof(struct landlock_cred_security),
> +	.lbs_inode = sizeof(struct landlock_inode_security),
> +	.lbs_superblock = sizeof(struct landlock_superblock_security),
>  };
>
>  static int __init landlock_init(void)
>  {
>  	landlock_add_cred_hooks();
>  	landlock_add_ptrace_hooks();
> +	landlock_add_fs_hooks();
> +	landlock_initialized = true;
>  	pr_info("Up and running.\n");
>  	return 0;
>  }
> diff --git a/security/landlock/setup.h b/security/landlock/setup.h
> index 9fdbf33fcc33..1daffab1ab4b 100644
> --- a/security/landlock/setup.h
> +++ b/security/landlock/setup.h
> @@ -11,6 +11,8 @@
>
>  #include <linux/lsm_hooks.h>
>
> +extern bool landlock_initialized;
> +
>  extern struct lsm_blob_sizes landlock_blob_sizes;
>
>  #endif /* _SECURITY_LANDLOCK_SETUP_H */
>
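
For anyone reading the layer handling in unmask_layers() and check_access_path() quoted above, a small user-space sketch of the idea may help. This is not the kernel code. The demo_* names, the two access bits and the parent-pointer tree are hypothetical stand-ins for dentries and Landlock rules, and the sketch leaves out mount point traversal, the fs_access_mask filtering and all locking. It only shows the bitmask walk: start with one bit set per domain layer, clear a layer's bit whenever a rule found on the way up grants the whole request, and allow the access only if every bit is cleared before the walk runs out of parents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits; the values are arbitrary, not the kernel's. */
#define DEMO_ACCESS_READ	(1u << 0)
#define DEMO_ACCESS_WRITE	(1u << 1)

/* One layer entry of a rule: which layer it belongs to, and what it grants. */
struct demo_layer {
	unsigned int level;	/* 1-based layer number, as in the patch */
	uint32_t access;	/* accesses granted by this layer's rule */
};

/* Stand-in for a dentry: a parent link plus an optional rule. */
struct demo_node {
	const struct demo_node *parent;		/* NULL at the root */
	const struct demo_layer *layers;	/* rule layers, or NULL */
	size_t num_layers;			/* 0 when no rule is tied here */
};

/* Clears the bit of every layer whose rule grants the whole request. */
static uint64_t demo_unmask_layers(const struct demo_node *node,
		uint32_t request, uint64_t layer_mask)
{
	size_t i;

	for (i = 0; i < node->num_layers; i++) {
		const struct demo_layer *layer = &node->layers[i];

		if ((layer->access & request) == request)
			layer_mask &= ~(UINT64_C(1) << (layer->level - 1));
	}
	return layer_mask;
}

/*
 * Walks from @node up to the root. Returns 0 if each of the @num_layers
 * domain layers granted @request somewhere on the walk, -1 otherwise.
 */
static int demo_check_access(const struct demo_node *node,
		unsigned int num_layers, uint32_t request)
{
	uint64_t layer_mask;

	assert(num_layers >= 1 && num_layers <= 64);
	/* One pending bit per layer (avoids shifting by 64). */
	layer_mask = num_layers == 64 ? UINT64_MAX :
		(UINT64_C(1) << num_layers) - 1;

	for (; node; node = node->parent) {
		layer_mask = demo_unmask_layers(node, request, layer_mask);
		if (layer_mask == 0)
			return 0;
	}
	return -1;
}
```

With a two-layer domain where layer 1 grants only read at the root and layer 2 grants read and write on a subdirectory, a read request on a file below that subdirectory is allowed, because both bits get cleared during the walk. A write request is denied, because layer 1 never grants it and its bit stays set. That matches the "for each policy layer, at least one rule" wording in the comment above.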