Message ID | 20201209192839.1396820-8-mic@digikod.net |
---|---|
State | Superseded |
On Wed, Dec 9, 2020 at 8:28 PM Mickaël Salaün <mic@digikod.net> wrote: > Thanks to the Landlock objects and ruleset, it is possible to identify > inodes according to a process's domain. To enable an unprivileged > process to express a file hierarchy, it first needs to open a directory > (or a file) and pass this file descriptor to the kernel through > landlock_add_rule(2). When checking if a file access request is > allowed, we walk from the requested dentry to the real root, following > the different mount layers. The access to each "tagged" inodes are > collected according to their rule layer level, and ANDed to create > access to the requested file hierarchy. This makes possible to identify > a lot of files without tagging every inodes nor modifying the > filesystem, while still following the view and understanding the user > has from the filesystem. > > Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not > keep the same struct inodes for the same inodes whereas these inodes are > in use. > > This commit adds a minimal set of supported filesystem access-control > which doesn't enable to restrict all file-related actions. This is the > result of multiple discussions to minimize the code of Landlock to ease > review. Thanks to the Landlock design, extending this access-control > without breaking user space will not be a problem. Moreover, seccomp > filters can be used to restrict the use of syscall families which may > not be currently handled by Landlock. [...] > +static bool check_access_path_continue( > + const struct landlock_ruleset *const domain, > + const struct path *const path, const u32 access_request, > + u64 *const layer_mask) > +{ [...] > + /* > + * An access is granted if, for each policy layer, at least one rule > + * encountered on the pathwalk grants the access, regardless of their > + * position in the layer stack. We must then check not-yet-seen layers > + * for each inode, from the last one added to the first one. 
> +	 */
> +	for (i = 0; i < rule->num_layers; i++) {
> +		const struct landlock_layer *const layer = &rule->layers[i];
> +		const u64 layer_level = BIT_ULL(layer->level - 1);
> +
> +		if (!(layer_level & *layer_mask))
> +			continue;
> +		if ((layer->access & access_request) != access_request)
> +			return false;
> +		*layer_mask &= ~layer_level;

Hmm... shouldn't the last 5 lines be replaced by the following?

	if ((layer->access & access_request) == access_request)
		*layer_mask &= ~layer_level;

And then, since this function would always return true, you could
change its return type to "void".

As far as I can tell, the current version will still, if a ruleset
looks like this:

/usr read+write
/usr/lib/ read

reject write access to /usr/lib, right?

> +	}
> +	return true;
> +}
On 14/01/2021 04:22, Jann Horn wrote: > On Wed, Dec 9, 2020 at 8:28 PM Mickaël Salaün <mic@digikod.net> wrote: >> Thanks to the Landlock objects and ruleset, it is possible to identify >> inodes according to a process's domain. To enable an unprivileged >> process to express a file hierarchy, it first needs to open a directory >> (or a file) and pass this file descriptor to the kernel through >> landlock_add_rule(2). When checking if a file access request is >> allowed, we walk from the requested dentry to the real root, following >> the different mount layers. The access to each "tagged" inodes are >> collected according to their rule layer level, and ANDed to create >> access to the requested file hierarchy. This makes possible to identify >> a lot of files without tagging every inodes nor modifying the >> filesystem, while still following the view and understanding the user >> has from the filesystem. >> >> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not >> keep the same struct inodes for the same inodes whereas these inodes are >> in use. >> >> This commit adds a minimal set of supported filesystem access-control >> which doesn't enable to restrict all file-related actions. This is the >> result of multiple discussions to minimize the code of Landlock to ease >> review. Thanks to the Landlock design, extending this access-control >> without breaking user space will not be a problem. Moreover, seccomp >> filters can be used to restrict the use of syscall families which may >> not be currently handled by Landlock. > [...] >> +static bool check_access_path_continue( >> + const struct landlock_ruleset *const domain, >> + const struct path *const path, const u32 access_request, >> + u64 *const layer_mask) >> +{ > [...] >> + /* >> + * An access is granted if, for each policy layer, at least one rule >> + * encountered on the pathwalk grants the access, regardless of their >> + * position in the layer stack. 
We must then check not-yet-seen layers
>> +	 * for each inode, from the last one added to the first one.
>> +	 */
>> +	for (i = 0; i < rule->num_layers; i++) {
>> +		const struct landlock_layer *const layer = &rule->layers[i];
>> +		const u64 layer_level = BIT_ULL(layer->level - 1);
>> +
>> +		if (!(layer_level & *layer_mask))
>> +			continue;
>> +		if ((layer->access & access_request) != access_request)
>> +			return false;
>> +		*layer_mask &= ~layer_level;
>
> Hmm... shouldn't the last 5 lines be replaced by the following?
>
> if ((layer->access & access_request) == access_request)
>         *layer_mask &= ~layer_level;
>
> And then, since this function would always return true, you could
> change its return type to "void".
>
>
> As far as I can tell, the current version will still, if a ruleset
> looks like this:
>
> /usr read+write
> /usr/lib/ read
>
> reject write access to /usr/lib, right?

If these two rules are from different layers, then yes, it would work as
intended. However, if these rules are from the same layer, the path walk
will not stop at /usr/lib but continue on to /usr, which grants write
access. This is the reason I wrote it like this, and the
layout1.inherit_subset test checks that.

I'm updating the documentation to better explain how an access is checked
with one or multiple layers. Doing it this way also makes it possible to
stop the path walk earlier, which is the original purpose of this function.

>
>
>> +	}
>> +	return true;
>> +}
On Thu, Jan 14, 2021 at 7:54 PM Mickaël Salaün <mic@digikod.net> wrote: > On 14/01/2021 04:22, Jann Horn wrote: > > On Wed, Dec 9, 2020 at 8:28 PM Mickaël Salaün <mic@digikod.net> wrote: > >> Thanks to the Landlock objects and ruleset, it is possible to identify > >> inodes according to a process's domain. To enable an unprivileged > >> process to express a file hierarchy, it first needs to open a directory > >> (or a file) and pass this file descriptor to the kernel through > >> landlock_add_rule(2). When checking if a file access request is > >> allowed, we walk from the requested dentry to the real root, following > >> the different mount layers. The access to each "tagged" inodes are > >> collected according to their rule layer level, and ANDed to create > >> access to the requested file hierarchy. This makes possible to identify > >> a lot of files without tagging every inodes nor modifying the > >> filesystem, while still following the view and understanding the user > >> has from the filesystem. > >> > >> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not > >> keep the same struct inodes for the same inodes whereas these inodes are > >> in use. > >> > >> This commit adds a minimal set of supported filesystem access-control > >> which doesn't enable to restrict all file-related actions. This is the > >> result of multiple discussions to minimize the code of Landlock to ease > >> review. Thanks to the Landlock design, extending this access-control > >> without breaking user space will not be a problem. Moreover, seccomp > >> filters can be used to restrict the use of syscall families which may > >> not be currently handled by Landlock. > > [...] > >> +static bool check_access_path_continue( > >> + const struct landlock_ruleset *const domain, > >> + const struct path *const path, const u32 access_request, > >> + u64 *const layer_mask) > >> +{ > > [...] 
> >> + /* > >> + * An access is granted if, for each policy layer, at least one rule > >> + * encountered on the pathwalk grants the access, regardless of their > >> + * position in the layer stack. We must then check not-yet-seen layers > >> + * for each inode, from the last one added to the first one. > >> + */ > >> + for (i = 0; i < rule->num_layers; i++) { > >> + const struct landlock_layer *const layer = &rule->layers[i]; > >> + const u64 layer_level = BIT_ULL(layer->level - 1); > >> + > >> + if (!(layer_level & *layer_mask)) > >> + continue; > >> + if ((layer->access & access_request) != access_request) > >> + return false; > >> + *layer_mask &= ~layer_level; > > > > Hmm... shouldn't the last 5 lines be replaced by the following? > > > > if ((layer->access & access_request) == access_request) > > *layer_mask &= ~layer_level; > > > > And then, since this function would always return true, you could > > change its return type to "void". > > > > > > As far as I can tell, the current version will still, if a ruleset > > looks like this: > > > > /usr read+write > > /usr/lib/ read > > > > reject write access to /usr/lib, right? > > If these two rules are from different layers, then yes it would work as > intended. However, if these rules are from the same layer the path walk > will not stop at /usr/lib but go down to /usr, which grants write > access. I don't see why the code would do what you're saying it does. 
And an experiment seems to confirm what I said; I checked out landlock-v26,
and the behavior I get is:

user@vm:~/landlock$ dd if=/dev/null of=/tmp/aaa
0+0 records in
0+0 records out
0 bytes copied, 0.00106365 s, 0.0 kB/s
user@vm:~/landlock$ LL_FS_RO='/lib' LL_FS_RW='/' ./sandboxer dd if=/dev/null of=/tmp/aaa
0+0 records in
0+0 records out
0 bytes copied, 0.000491814 s, 0.0 kB/s
user@vm:~/landlock$ LL_FS_RO='/tmp' LL_FS_RW='/' ./sandboxer dd if=/dev/null of=/tmp/aaa
dd: failed to open '/tmp/aaa': Permission denied
user@vm:~/landlock$

Granting read access to /tmp prevents writing to it, even though write
access was granted to /.
On 14/01/2021 23:43, Jann Horn wrote: > On Thu, Jan 14, 2021 at 7:54 PM Mickaël Salaün <mic@digikod.net> wrote: >> On 14/01/2021 04:22, Jann Horn wrote: >>> On Wed, Dec 9, 2020 at 8:28 PM Mickaël Salaün <mic@digikod.net> wrote: >>>> Thanks to the Landlock objects and ruleset, it is possible to identify >>>> inodes according to a process's domain. To enable an unprivileged >>>> process to express a file hierarchy, it first needs to open a directory >>>> (or a file) and pass this file descriptor to the kernel through >>>> landlock_add_rule(2). When checking if a file access request is >>>> allowed, we walk from the requested dentry to the real root, following >>>> the different mount layers. The access to each "tagged" inodes are >>>> collected according to their rule layer level, and ANDed to create >>>> access to the requested file hierarchy. This makes possible to identify >>>> a lot of files without tagging every inodes nor modifying the >>>> filesystem, while still following the view and understanding the user >>>> has from the filesystem. >>>> >>>> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not >>>> keep the same struct inodes for the same inodes whereas these inodes are >>>> in use. >>>> >>>> This commit adds a minimal set of supported filesystem access-control >>>> which doesn't enable to restrict all file-related actions. This is the >>>> result of multiple discussions to minimize the code of Landlock to ease >>>> review. Thanks to the Landlock design, extending this access-control >>>> without breaking user space will not be a problem. Moreover, seccomp >>>> filters can be used to restrict the use of syscall families which may >>>> not be currently handled by Landlock. >>> [...] >>>> +static bool check_access_path_continue( >>>> + const struct landlock_ruleset *const domain, >>>> + const struct path *const path, const u32 access_request, >>>> + u64 *const layer_mask) >>>> +{ >>> [...] 
>>>> + /* >>>> + * An access is granted if, for each policy layer, at least one rule >>>> + * encountered on the pathwalk grants the access, regardless of their >>>> + * position in the layer stack. We must then check not-yet-seen layers >>>> + * for each inode, from the last one added to the first one. >>>> + */ >>>> + for (i = 0; i < rule->num_layers; i++) { >>>> + const struct landlock_layer *const layer = &rule->layers[i]; >>>> + const u64 layer_level = BIT_ULL(layer->level - 1); >>>> + >>>> + if (!(layer_level & *layer_mask)) >>>> + continue; >>>> + if ((layer->access & access_request) != access_request) >>>> + return false; >>>> + *layer_mask &= ~layer_level; >>> >>> Hmm... shouldn't the last 5 lines be replaced by the following? >>> >>> if ((layer->access & access_request) == access_request) >>> *layer_mask &= ~layer_level; >>> >>> And then, since this function would always return true, you could >>> change its return type to "void". >>> >>> >>> As far as I can tell, the current version will still, if a ruleset >>> looks like this: >>> >>> /usr read+write >>> /usr/lib/ read >>> >>> reject write access to /usr/lib, right? >> >> If these two rules are from different layers, then yes it would work as >> intended. However, if these rules are from the same layer the path walk >> will not stop at /usr/lib but go down to /usr, which grants write >> access. > > I don't see why the code would do what you're saying it does. And an > experiment seems to confirm what I said; I checked out landlock-v26, > and the behavior I get is: There is a misunderstanding, I was responding to your proposition to modify check_access_path_continue(), not about the behavior of landlock-v26. 
>
> user@vm:~/landlock$ dd if=/dev/null of=/tmp/aaa
> 0+0 records in
> 0+0 records out
> 0 bytes copied, 0.00106365 s, 0.0 kB/s
> user@vm:~/landlock$ LL_FS_RO='/lib' LL_FS_RW='/' ./sandboxer dd
> if=/dev/null of=/tmp/aaa
> 0+0 records in
> 0+0 records out
> 0 bytes copied, 0.000491814 s, 0.0 kB/s
> user@vm:~/landlock$ LL_FS_RO='/tmp' LL_FS_RW='/' ./sandboxer dd
> if=/dev/null of=/tmp/aaa
> dd: failed to open '/tmp/aaa': Permission denied
> user@vm:~/landlock$
>
> Granting read access to /tmp prevents writing to it, even though write
> access was granted to /.
>

It indeed works like this with landlock-v26. However, with your proposal
above, it would work like this:

$ LL_FS_RO='/tmp' LL_FS_RW='/' ./sandboxer dd if=/dev/null of=/tmp/aaa
0+0 records in
0+0 records out
0 bytes copied, 0.000187265 s, 0.0 kB/s

…which is not what users would expect, I guess. :)
On Fri, Jan 15, 2021 at 10:10 AM Mickaël Salaün <mic@digikod.net> wrote: > On 14/01/2021 23:43, Jann Horn wrote: > > On Thu, Jan 14, 2021 at 7:54 PM Mickaël Salaün <mic@digikod.net> wrote: > >> On 14/01/2021 04:22, Jann Horn wrote: > >>> On Wed, Dec 9, 2020 at 8:28 PM Mickaël Salaün <mic@digikod.net> wrote: > >>>> Thanks to the Landlock objects and ruleset, it is possible to identify > >>>> inodes according to a process's domain. To enable an unprivileged > >>>> process to express a file hierarchy, it first needs to open a directory > >>>> (or a file) and pass this file descriptor to the kernel through > >>>> landlock_add_rule(2). When checking if a file access request is > >>>> allowed, we walk from the requested dentry to the real root, following > >>>> the different mount layers. The access to each "tagged" inodes are > >>>> collected according to their rule layer level, and ANDed to create > >>>> access to the requested file hierarchy. This makes possible to identify > >>>> a lot of files without tagging every inodes nor modifying the > >>>> filesystem, while still following the view and understanding the user > >>>> has from the filesystem. > >>>> > >>>> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not > >>>> keep the same struct inodes for the same inodes whereas these inodes are > >>>> in use. > >>>> > >>>> This commit adds a minimal set of supported filesystem access-control > >>>> which doesn't enable to restrict all file-related actions. This is the > >>>> result of multiple discussions to minimize the code of Landlock to ease > >>>> review. Thanks to the Landlock design, extending this access-control > >>>> without breaking user space will not be a problem. Moreover, seccomp > >>>> filters can be used to restrict the use of syscall families which may > >>>> not be currently handled by Landlock. > >>> [...] 
> >>>> +static bool check_access_path_continue( > >>>> + const struct landlock_ruleset *const domain, > >>>> + const struct path *const path, const u32 access_request, > >>>> + u64 *const layer_mask) > >>>> +{ > >>> [...] > >>>> + /* > >>>> + * An access is granted if, for each policy layer, at least one rule > >>>> + * encountered on the pathwalk grants the access, regardless of their > >>>> + * position in the layer stack. We must then check not-yet-seen layers > >>>> + * for each inode, from the last one added to the first one. > >>>> + */ > >>>> + for (i = 0; i < rule->num_layers; i++) { > >>>> + const struct landlock_layer *const layer = &rule->layers[i]; > >>>> + const u64 layer_level = BIT_ULL(layer->level - 1); > >>>> + > >>>> + if (!(layer_level & *layer_mask)) > >>>> + continue; > >>>> + if ((layer->access & access_request) != access_request) > >>>> + return false; > >>>> + *layer_mask &= ~layer_level; > >>> > >>> Hmm... shouldn't the last 5 lines be replaced by the following? > >>> > >>> if ((layer->access & access_request) == access_request) > >>> *layer_mask &= ~layer_level; > >>> > >>> And then, since this function would always return true, you could > >>> change its return type to "void". > >>> > >>> > >>> As far as I can tell, the current version will still, if a ruleset > >>> looks like this: > >>> > >>> /usr read+write > >>> /usr/lib/ read > >>> > >>> reject write access to /usr/lib, right? > >> > >> If these two rules are from different layers, then yes it would work as > >> intended. However, if these rules are from the same layer the path walk > >> will not stop at /usr/lib but go down to /usr, which grants write > >> access. > > > > I don't see why the code would do what you're saying it does. 
And an > > experiment seems to confirm what I said; I checked out landlock-v26, > > and the behavior I get is: > > There is a misunderstanding, I was responding to your proposition to > modify check_access_path_continue(), not about the behavior of landlock-v26. > > > > > user@vm:~/landlock$ dd if=/dev/null of=/tmp/aaa > > 0+0 records in > > 0+0 records out > > 0 bytes copied, 0.00106365 s, 0.0 kB/s > > user@vm:~/landlock$ LL_FS_RO='/lib' LL_FS_RW='/' ./sandboxer dd > > if=/dev/null of=/tmp/aaa > > 0+0 records in > > 0+0 records out > > 0 bytes copied, 0.000491814 s, 0.0 kB/s > > user@vm:~/landlock$ LL_FS_RO='/tmp' LL_FS_RW='/' ./sandboxer dd > > if=/dev/null of=/tmp/aaa > > dd: failed to open '/tmp/aaa': Permission denied > > user@vm:~/landlock$ > > > > Granting read access to /tmp prevents writing to it, even though write > > access was granted to /. > > > > It indeed works like this with landlock-v26. However, with your above > proposition, it would work like this: > > $ LL_FS_RO='/tmp' LL_FS_RW='/' ./sandboxer dd if=/dev/null of=/tmp/aaa > 0+0 records in > 0+0 records out > 0 bytes copied, 0.000187265 s, 0.0 kB/s > > …which is not what users would expect I guess. :) Ah, so we are disagreeing about what the right semantics are. ^^ To me, that is exactly the behavior I would expect. Imagine that someone wants to write a program that needs to be able to load libraries from /usr/lib (including subdirectories) and needs to be able to write output to some user-specified output directory. 
So they use something like this to sandbox their program (plus error
handling):

static void add_fs_rule(int ruleset_fd, char *path, u64 allowed_access) {
	int fd = open(path, O_PATH);
	struct landlock_path_beneath_attr path_beneath = {
		.parent_fd = fd,
		.allowed_access = allowed_access
	};
	landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
			  &path_beneath, 0);
	close(fd);
}
int main(int argc, char **argv) {
	char *output_dir = argv[1];
	int ruleset_fd = landlock_create_ruleset(&ruleset_attr,
						 sizeof(ruleset_attr), 0);
	add_fs_rule(ruleset_fd, "/usr/lib", ACCESS_FS_ROUGHLY_READ);
	add_fs_rule(ruleset_fd, output_dir,
		    LANDLOCK_ACCESS_FS_WRITE_FILE |
		    LANDLOCK_ACCESS_FS_MAKE_REG |
		    LANDLOCK_ACCESS_FS_REMOVE_FILE);
	prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
	landlock_enforce_ruleset_current(ruleset_fd, 0);
}

This will *almost* always work; but if the output directory is
/usr/lib/x86_64-linux-gnu/, loading libraries from that directory
won't work anymore, right? So if userspace wanted this to *always*
work correctly, it would have to somehow figure out whether there is
a path upwards from the output directory (under any mount) that will
encounter /usr/lib, and set different permissions if that is the case.
That seems unnecessarily messy to me; and I think that this will make
it harder for generic command-line tools and such to adopt Landlock.

If you do want to have the ability to deny access to subtrees of trees
to which access is permitted, I think that that should be made
explicit in the UAPI - e.g. you could (at a later point, after this
series has landed) introduce a new EXCLUDE flag for
landlock_add_rule() that means "I want to deny the access specified by
this rule", or something like that. (And you'd have to very carefully
document under which circumstances such rules are actually effective -
e.g. if someone grants full access to $HOME, but excludes $HOME/.ssh,
an attacker would still be able to rename $HOME/.ssh to $HOME/old_ssh,
and then if the program is later restarted and creates the ruleset
from scratch again, the old SSH folder will be accessible.)
On 15/01/2021 19:31, Jann Horn wrote: > On Fri, Jan 15, 2021 at 10:10 AM Mickaël Salaün <mic@digikod.net> wrote: >> On 14/01/2021 23:43, Jann Horn wrote: >>> On Thu, Jan 14, 2021 at 7:54 PM Mickaël Salaün <mic@digikod.net> wrote: >>>> On 14/01/2021 04:22, Jann Horn wrote: >>>>> On Wed, Dec 9, 2020 at 8:28 PM Mickaël Salaün <mic@digikod.net> wrote: >>>>>> Thanks to the Landlock objects and ruleset, it is possible to identify >>>>>> inodes according to a process's domain. To enable an unprivileged >>>>>> process to express a file hierarchy, it first needs to open a directory >>>>>> (or a file) and pass this file descriptor to the kernel through >>>>>> landlock_add_rule(2). When checking if a file access request is >>>>>> allowed, we walk from the requested dentry to the real root, following >>>>>> the different mount layers. The access to each "tagged" inodes are >>>>>> collected according to their rule layer level, and ANDed to create >>>>>> access to the requested file hierarchy. This makes possible to identify >>>>>> a lot of files without tagging every inodes nor modifying the >>>>>> filesystem, while still following the view and understanding the user >>>>>> has from the filesystem. >>>>>> >>>>>> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not >>>>>> keep the same struct inodes for the same inodes whereas these inodes are >>>>>> in use. >>>>>> >>>>>> This commit adds a minimal set of supported filesystem access-control >>>>>> which doesn't enable to restrict all file-related actions. This is the >>>>>> result of multiple discussions to minimize the code of Landlock to ease >>>>>> review. Thanks to the Landlock design, extending this access-control >>>>>> without breaking user space will not be a problem. Moreover, seccomp >>>>>> filters can be used to restrict the use of syscall families which may >>>>>> not be currently handled by Landlock. >>>>> [...] 
>>>>>> +static bool check_access_path_continue( >>>>>> + const struct landlock_ruleset *const domain, >>>>>> + const struct path *const path, const u32 access_request, >>>>>> + u64 *const layer_mask) >>>>>> +{ >>>>> [...] >>>>>> + /* >>>>>> + * An access is granted if, for each policy layer, at least one rule >>>>>> + * encountered on the pathwalk grants the access, regardless of their >>>>>> + * position in the layer stack. We must then check not-yet-seen layers >>>>>> + * for each inode, from the last one added to the first one. >>>>>> + */ >>>>>> + for (i = 0; i < rule->num_layers; i++) { >>>>>> + const struct landlock_layer *const layer = &rule->layers[i]; >>>>>> + const u64 layer_level = BIT_ULL(layer->level - 1); >>>>>> + >>>>>> + if (!(layer_level & *layer_mask)) >>>>>> + continue; >>>>>> + if ((layer->access & access_request) != access_request) >>>>>> + return false; >>>>>> + *layer_mask &= ~layer_level; >>>>> >>>>> Hmm... shouldn't the last 5 lines be replaced by the following? >>>>> >>>>> if ((layer->access & access_request) == access_request) >>>>> *layer_mask &= ~layer_level; >>>>> >>>>> And then, since this function would always return true, you could >>>>> change its return type to "void". >>>>> >>>>> >>>>> As far as I can tell, the current version will still, if a ruleset >>>>> looks like this: >>>>> >>>>> /usr read+write >>>>> /usr/lib/ read >>>>> >>>>> reject write access to /usr/lib, right? >>>> >>>> If these two rules are from different layers, then yes it would work as >>>> intended. However, if these rules are from the same layer the path walk >>>> will not stop at /usr/lib but go down to /usr, which grants write >>>> access. >>> >>> I don't see why the code would do what you're saying it does. 
And an >>> experiment seems to confirm what I said; I checked out landlock-v26, >>> and the behavior I get is: >> >> There is a misunderstanding, I was responding to your proposition to >> modify check_access_path_continue(), not about the behavior of landlock-v26. >> >>> >>> user@vm:~/landlock$ dd if=/dev/null of=/tmp/aaa >>> 0+0 records in >>> 0+0 records out >>> 0 bytes copied, 0.00106365 s, 0.0 kB/s >>> user@vm:~/landlock$ LL_FS_RO='/lib' LL_FS_RW='/' ./sandboxer dd >>> if=/dev/null of=/tmp/aaa >>> 0+0 records in >>> 0+0 records out >>> 0 bytes copied, 0.000491814 s, 0.0 kB/s >>> user@vm:~/landlock$ LL_FS_RO='/tmp' LL_FS_RW='/' ./sandboxer dd >>> if=/dev/null of=/tmp/aaa >>> dd: failed to open '/tmp/aaa': Permission denied >>> user@vm:~/landlock$ >>> >>> Granting read access to /tmp prevents writing to it, even though write >>> access was granted to /. >>> >> >> It indeed works like this with landlock-v26. However, with your above >> proposition, it would work like this: >> >> $ LL_FS_RO='/tmp' LL_FS_RW='/' ./sandboxer dd if=/dev/null of=/tmp/aaa >> 0+0 records in >> 0+0 records out >> 0 bytes copied, 0.000187265 s, 0.0 kB/s >> >> …which is not what users would expect I guess. :) > > Ah, so we are disagreeing about what the right semantics are. ^^ To > me, that is exactly the behavior I would expect. > > Imagine that someone wants to write a program that needs to be able to > load libraries from /usr/lib (including subdirectories) and needs to > be able to write output to some user-specified output directory. 
So > they use something like this to sandbox their program (plus error > handling): > > static void add_fs_rule(int ruleset_fd, char *path, u64 allowed_access) { > int fd = open(path, O_PATH); > struct landlock_path_beneath_attr path_beneath = { > .parent_fd = fd, > .allowed_access = allowed_access > }; > landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH, > &path_beneath, 0); > close(fd); > } > int main(int argc, char **argv) { > char *output_dir = argv[1]; > int ruleset_fd = landlock_create_ruleset(&ruleset_attr, > sizeof(ruleset_attr, 0); > add_fs_rule(ruleset_fd, "/usr/lib", ACCESS_FS_ROUGHLY_READ); > add_fs_rule(ruleset_fd, output_dir, > LANDLOCK_ACCESS_FS_WRITE_FILE|LANDLOCK_ACCESS_FS_MAKE_REG|LANDLOCK_ACCESS_FS_REMOVE_FILE); > prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); > landlock_enforce_ruleset_current(ruleset_fd, 0); > } > > This will *almost* always work; but if the output directory is > /usr/lib/x86_64-linux-gnu/ , loading libraries from that directory > won't work anymore, right? So if userspace wanted this to *always* > works correctly, it would have to somehow figure out whether there is > a path upwards from the output directory (under any mount) that will > encounter /usr/lib, and set different permissions if that is the case. > That seems unnecessarily messy to me; and I think that this will make > it harder for generic commandline tools and such to adopt landlock. > > > If you do want to have the ability to deny access to subtrees of trees > to which access is permitted, I think that that should be made > explicit in the UAPI - e.g. you could (at a later point, after this > series has landed) introduce a new EXCLUDE flag for > landlock_add_rule() that means "I want to deny the access specified by > this rule", or something like that. (And you'd have to very carefully > document under which circumstances such rules are actually effective - > e.g. 
if someone grants full access to $HOME, but excludes $HOME/.ssh,
> an attacker would still be able to rename $HOME/.ssh to $HOME/old_ssh,
> and then if the program is later restarted and creates the ruleset
> from scratch again, the old SSH folder will be accessible.)
>

OK, it's indeed a more pragmatic approach. I'll take your change and
merge check_access_path_continue() with check_access_path().

Thanks!
diff --git a/MAINTAINERS b/MAINTAINERS index dc718573317e..8656d3b9dd0e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9833,6 +9833,7 @@ L: linux-security-module@vger.kernel.org S: Supported W: https://landlock.io T: git https://github.com/landlock-lsm/linux.git +F: include/uapi/linux/landlock.h F: security/landlock/ K: landlock K: LANDLOCK diff --git a/arch/Kconfig b/arch/Kconfig index ba4e966484ab..e1f8180521fb 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -884,6 +884,13 @@ config COMPAT_32BIT_TIME config ARCH_NO_PREEMPT bool +config ARCH_EPHEMERAL_INODES + def_bool n + help + An arch should select this symbol if it doesn't keep track of inode + instances on its own, but instead relies on something else (e.g. the host + kernel for an UML kernel). + config ARCH_SUPPORTS_RT bool diff --git a/arch/um/Kconfig b/arch/um/Kconfig index 4b799fad8b48..082d0207a7be 100644 --- a/arch/um/Kconfig +++ b/arch/um/Kconfig @@ -5,6 +5,7 @@ menu "UML-specific options" config UML bool default y + select ARCH_EPHEMERAL_INODES select ARCH_HAS_KCOV select ARCH_NO_PREEMPT select HAVE_ARCH_AUDITSYSCALL diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h new file mode 100644 index 000000000000..d547bd49fe38 --- /dev/null +++ b/include/uapi/linux/landlock.h @@ -0,0 +1,75 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * Landlock - User space API + * + * Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net> + * Copyright © 2018-2020 ANSSI + */ + +#ifndef _UAPI__LINUX_LANDLOCK_H__ +#define _UAPI__LINUX_LANDLOCK_H__ + +/** + * DOC: fs_access + * + * A set of actions on kernel objects may be defined by an attribute (e.g. + * &struct landlock_path_beneath_attr) including a bitmask of access. + * + * Filesystem flags + * ~~~~~~~~~~~~~~~~ + * + * These flags enable to restrict a sandboxed process to a set of actions on + * files and directories. Files or directories opened before the sandboxing + * are not subject to these restrictions. 
+ * + * A file can only receive these access rights: + * + * - %LANDLOCK_ACCESS_FS_EXECUTE: Execute a file. + * - %LANDLOCK_ACCESS_FS_WRITE_FILE: Open a file with write access. + * - %LANDLOCK_ACCESS_FS_READ_FILE: Open a file with read access. + * + * A directory can receive access rights related to files or directories. The + * following access right is applied to the directory itself, and the + * directories beneath it: + * + * - %LANDLOCK_ACCESS_FS_READ_DIR: Open a directory or list its content. + * + * However, the following access rights only apply to the content of a + * directory, not the directory itself: + * + * - %LANDLOCK_ACCESS_FS_REMOVE_DIR: Remove an empty directory or rename one. + * - %LANDLOCK_ACCESS_FS_REMOVE_FILE: Unlink (or rename) a file. + * - %LANDLOCK_ACCESS_FS_MAKE_CHAR: Create (or rename or link) a character + * device. + * - %LANDLOCK_ACCESS_FS_MAKE_DIR: Create (or rename) a directory. + * - %LANDLOCK_ACCESS_FS_MAKE_REG: Create (or rename or link) a regular file. + * - %LANDLOCK_ACCESS_FS_MAKE_SOCK: Create (or rename or link) a UNIX domain + * socket. + * - %LANDLOCK_ACCESS_FS_MAKE_FIFO: Create (or rename or link) a named pipe. + * - %LANDLOCK_ACCESS_FS_MAKE_BLOCK: Create (or rename or link) a block device. + * - %LANDLOCK_ACCESS_FS_MAKE_SYM: Create (or rename or link) a symbolic link. + * + * .. warning:: + * + * It is currently not possible to restrict some file-related actions + * accessible through these syscall families: :manpage:`chdir(2)`, + * :manpage:`truncate(2)`, :manpage:`stat(2)`, :manpage:`flock(2)`, + * :manpage:`chmod(2)`, :manpage:`chown(2)`, :manpage:`setxattr(2)`, + * :manpage:`ioctl(2)`, :manpage:`fcntl(2)`. + * Future Landlock evolutions will enable to restrict them. 
+ */
+#define LANDLOCK_ACCESS_FS_EXECUTE			(1ULL << 0)
+#define LANDLOCK_ACCESS_FS_WRITE_FILE			(1ULL << 1)
+#define LANDLOCK_ACCESS_FS_READ_FILE			(1ULL << 2)
+#define LANDLOCK_ACCESS_FS_READ_DIR			(1ULL << 3)
+#define LANDLOCK_ACCESS_FS_REMOVE_DIR			(1ULL << 4)
+#define LANDLOCK_ACCESS_FS_REMOVE_FILE			(1ULL << 5)
+#define LANDLOCK_ACCESS_FS_MAKE_CHAR			(1ULL << 6)
+#define LANDLOCK_ACCESS_FS_MAKE_DIR			(1ULL << 7)
+#define LANDLOCK_ACCESS_FS_MAKE_REG			(1ULL << 8)
+#define LANDLOCK_ACCESS_FS_MAKE_SOCK			(1ULL << 9)
+#define LANDLOCK_ACCESS_FS_MAKE_FIFO			(1ULL << 10)
+#define LANDLOCK_ACCESS_FS_MAKE_BLOCK			(1ULL << 11)
+#define LANDLOCK_ACCESS_FS_MAKE_SYM			(1ULL << 12)
+
+#endif /* _UAPI__LINUX_LANDLOCK_H__ */
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
index ea58e6208afa..43e5b0bb0706 100644
--- a/security/landlock/Kconfig
+++ b/security/landlock/Kconfig
@@ -2,7 +2,7 @@
 
 config SECURITY_LANDLOCK
 	bool "Landlock support"
-	depends on SECURITY
+	depends on SECURITY && !ARCH_EPHEMERAL_INODES
 	select SECURITY_PATH
 	help
 	  Landlock is a safe sandboxing mechanism which enables processes to
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index f1d1eb72fa76..92e3d80ab8ed 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := setup.o object.o ruleset.o \
-	cred.o ptrace.o
+	cred.o ptrace.o fs.o
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
new file mode 100644
index 000000000000..cd80b1973bb5
--- /dev/null
+++ b/security/landlock/fs.c
@@ -0,0 +1,622 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Filesystem management and hooks
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include <linux/atomic.h>
+#include <linux/bitops.h>
+#include <linux/bits.h>
+#include <linux/compiler_types.h>
+#include <linux/dcache.h>
+#include <linux/err.h>
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/limits.h>
+#include <linux/list.h>
+#include <linux/lsm_hooks.h>
+#include <linux/mount.h>
+#include <linux/namei.h>
+#include <linux/path.h>
+#include <linux/rcupdate.h>
+#include <linux/spinlock.h>
+#include <linux/stat.h>
+#include <linux/types.h>
+#include <linux/wait_bit.h>
+#include <linux/workqueue.h>
+#include <uapi/linux/landlock.h>
+
+#include "common.h"
+#include "cred.h"
+#include "fs.h"
+#include "limits.h"
+#include "object.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/* Underlying object management */
+
+static void release_inode(struct landlock_object *const object)
+	__releases(object->lock)
+{
+	struct inode *const inode = object->underobj;
+	struct super_block *sb;
+
+	if (!inode) {
+		spin_unlock(&object->lock);
+		return;
+	}
+
+	spin_lock(&inode->i_lock);
+	/*
+	 * Make sure that if the filesystem is concurrently unmounted,
+	 * hook_sb_delete() will wait for us to finish iput().
+	 */
+	sb = inode->i_sb;
+	atomic_long_inc(&landlock_superblock(sb)->inode_refs);
+	rcu_assign_pointer(landlock_inode(inode)->object, NULL);
+	spin_unlock(&inode->i_lock);
+	spin_unlock(&object->lock);
+	/*
+	 * Now, new rules can safely be tied to @inode.
+	 */
+
+	iput(inode);
+	if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs))
+		wake_up_var(&landlock_superblock(sb)->inode_refs);
+}
+
+static const struct landlock_object_underops landlock_fs_underops = {
+	.release = release_inode
+};
+
+/* Ruleset management */
+
+static struct landlock_object *get_inode_object(struct inode *const inode)
+{
+	struct landlock_object *object, *new_object;
+	struct landlock_inode_security *inode_sec = landlock_inode(inode);
+
+	rcu_read_lock();
+retry:
+	object = rcu_dereference(inode_sec->object);
+	if (object) {
+		if (likely(refcount_inc_not_zero(&object->usage))) {
+			rcu_read_unlock();
+			return object;
+		}
+		/*
+		 * We are racing with release_inode(), the object is going
+		 * away. Wait for release_inode(), then retry.
+		 */
+		spin_lock(&object->lock);
+		spin_unlock(&object->lock);
+		goto retry;
+	}
+	rcu_read_unlock();
+
+	/*
+	 * If there is no object tied to @inode, then create a new one (without
+	 * holding any locks).
+	 */
+	new_object = landlock_create_object(&landlock_fs_underops, inode);
+	if (IS_ERR(new_object))
+		return new_object;
+
+	spin_lock(&inode->i_lock);
+	object = rcu_dereference_protected(inode_sec->object,
+			lockdep_is_held(&inode->i_lock));
+	if (unlikely(object)) {
+		/* Someone else just created the object, bail out and retry. */
+		spin_unlock(&inode->i_lock);
+		kfree(new_object);
+
+		rcu_read_lock();
+		goto retry;
+	}
+
+	rcu_assign_pointer(inode_sec->object, new_object);
+	/*
+	 * @inode will be released by hook_sb_delete() on its superblock
+	 * shutdown.
+	 */
+	ihold(inode);
+	spin_unlock(&inode->i_lock);
+	return new_object;
+}
+
+/* All access rights which can be tied to files. */
+#define ACCESS_FILE ( \
+	LANDLOCK_ACCESS_FS_EXECUTE | \
+	LANDLOCK_ACCESS_FS_WRITE_FILE | \
+	LANDLOCK_ACCESS_FS_READ_FILE)
+
+/*
+ * @path: Should have been checked by get_path_from_fd().
+ */
+int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
+		const struct path *const path, u32 access_rights)
+{
+	int err;
+	struct landlock_object *object;
+
+	/* Files only get access rights that make sense. */
+	if (!d_is_dir(path->dentry) && (access_rights | ACCESS_FILE) !=
+			ACCESS_FILE)
+		return -EINVAL;
+
+	/* Transforms relative access rights to absolute ones. */
+	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~ruleset->fs_access_mask;
+	object = get_inode_object(d_backing_inode(path->dentry));
+	if (IS_ERR(object))
+		return PTR_ERR(object);
+	mutex_lock(&ruleset->lock);
+	err = landlock_insert_rule(ruleset, object, access_rights);
+	mutex_unlock(&ruleset->lock);
+	/*
+	 * No need to check for an error because landlock_insert_rule()
+	 * increments the refcount for the new object if needed.
+	 */
+	landlock_put_object(object);
+	return err;
+}
+
+/* Access-control management */
+
+static bool check_access_path_continue(
+		const struct landlock_ruleset *const domain,
+		const struct path *const path, const u32 access_request,
+		u64 *const layer_mask)
+{
+	const struct landlock_rule *rule;
+	const struct inode *inode;
+	size_t i;
+
+	if (d_is_negative(path->dentry))
+		/* Continues to walk while there is no mapped inode. */
+		return true;
+	inode = d_backing_inode(path->dentry);
+	rcu_read_lock();
+	rule = landlock_find_rule(domain,
+			rcu_dereference(landlock_inode(inode)->object));
+	rcu_read_unlock();
+	if (!rule)
+		return true;
+
+	/*
+	 * An access is granted if, for each policy layer, at least one rule
+	 * encountered on the pathwalk grants the access, regardless of their
+	 * position in the layer stack. We must then check not-yet-seen layers
+	 * for each inode, from the last one added to the first one.
+	 */
+	for (i = 0; i < rule->num_layers; i++) {
+		const struct landlock_layer *const layer = &rule->layers[i];
+		const u64 layer_level = BIT_ULL(layer->level - 1);
+
+		if (!(layer_level & *layer_mask))
+			continue;
+		if ((layer->access & access_request) != access_request)
+			return false;
+		*layer_mask &= ~layer_level;
+	}
+	return true;
+}
+
+static int check_access_path(const struct landlock_ruleset *const domain,
+		const struct path *const path, u32 access_request)
+{
+	bool allowed = false;
+	struct path walker_path;
+	u64 layer_mask;
+
+	/* Make sure all layers can be checked. */
+	BUILD_BUG_ON(BITS_PER_TYPE(layer_mask) < LANDLOCK_MAX_NUM_LAYERS);
+
+	if (WARN_ON_ONCE(!domain || !path))
+		return 0;
+	/*
+	 * Allows access to pseudo filesystems that will never be mountable
+	 * (e.g. sockfs, pipefs), but can still be reachable through
+	 * /proc/self/fd .
+	 */
+	if ((path->dentry->d_sb->s_flags & SB_NOUSER) ||
+			(d_is_positive(path->dentry) &&
+			 unlikely(IS_PRIVATE(d_backing_inode(path->dentry)))))
+		return 0;
+	if (WARN_ON_ONCE(domain->num_layers < 1))
+		return -EACCES;
+
+	layer_mask = GENMASK_ULL(domain->num_layers - 1, 0);
+	/*
+	 * An access request which is not handled by the domain should be
+	 * allowed.
+	 */
+	access_request &= domain->fs_access_mask;
+	if (access_request == 0)
+		return 0;
+	walker_path = *path;
+	path_get(&walker_path);
+	/*
+	 * We need to walk through all the hierarchy to not miss any relevant
+	 * restriction.
+	 */
+	while (check_access_path_continue(domain, &walker_path, access_request,
+				&layer_mask)) {
+		struct dentry *parent_dentry;
+
+		/* Stops when a rule from each layer granted access. */
+		if (layer_mask == 0) {
+			allowed = true;
+			break;
+		}
+
+jump_up:
+		/*
+		 * Does not work with orphaned/private mounts like overlayfs
+		 * layers for now (cf. ovl_path_real() and ovl_path_open()).
+		 */
+		if (walker_path.dentry == walker_path.mnt->mnt_root) {
+			if (follow_up(&walker_path)) {
+				/* Ignores hidden mount points. */
+				goto jump_up;
+			} else {
+				/*
+				 * Stops at the real root. Denies access
+				 * because not all layers have granted access.
+				 */
+				allowed = false;
+				break;
+			}
+		}
+		if (unlikely(IS_ROOT(walker_path.dentry))) {
+			/*
+			 * Stops at disconnected root directories. Only allows
+			 * access to internal filesystems (e.g. nsfs which is
+			 * reachable through /proc/self/ns).
+			 */
+			allowed = !!(walker_path.mnt->mnt_flags & MNT_INTERNAL);
+			break;
+		}
+		parent_dentry = dget_parent(walker_path.dentry);
+		dput(walker_path.dentry);
+		walker_path.dentry = parent_dentry;
+	}
+	path_put(&walker_path);
+	return allowed ? 0 : -EACCES;
+}
+
+static inline int current_check_access_path(const struct path *const path,
+		const u32 access_request)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	return check_access_path(dom, path, access_request);
+}
+
+/* Super-block hooks */
+
+/*
+ * Release the inodes used in a security policy.
+ *
+ * Cf. fsnotify_unmount_inodes()
+ */
+static void hook_sb_delete(struct super_block *const sb)
+{
+	struct inode *inode, *iput_inode = NULL;
+
+	if (!landlock_initialized)
+		return;
+
+	spin_lock(&sb->s_inode_list_lock);
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+		struct landlock_inode_security *inode_sec =
+			landlock_inode(inode);
+		struct landlock_object *object;
+		bool do_put = false;
+
+		rcu_read_lock();
+		object = rcu_dereference(inode_sec->object);
+		if (!object) {
+			rcu_read_unlock();
+			continue;
+		}
+
+		spin_lock(&object->lock);
+		if (object->underobj) {
+			object->underobj = NULL;
+			do_put = true;
+			spin_lock(&inode->i_lock);
+			rcu_assign_pointer(inode_sec->object, NULL);
+			spin_unlock(&inode->i_lock);
+		}
+		spin_unlock(&object->lock);
+		rcu_read_unlock();
+		if (!do_put)
+			/*
+			 * A concurrent iput() in release_inode() is ongoing
+			 * and we will just wait for it to finish.
+			 */
+			continue;
+
+		/*
+		 * At this point, we own the ihold() reference that was
+		 * originally set up by get_inode_object(). Therefore we can
+		 * drop the list lock and know that the inode won't disappear
+		 * from under us until the next loop walk.
+		 */
+		spin_unlock(&sb->s_inode_list_lock);
+		/*
+		 * We can now actually put the previous inode, which is not
+		 * needed anymore for the loop walk.
+		 */
+		if (iput_inode)
+			iput(iput_inode);
+		iput_inode = inode;
+		spin_lock(&sb->s_inode_list_lock);
+	}
+	spin_unlock(&sb->s_inode_list_lock);
+	if (iput_inode)
+		iput(iput_inode);
+
+	/*
+	 * Wait for pending iput() in release_inode().
+	 */
+	wait_var_event(&landlock_superblock(sb)->inode_refs, !atomic_long_read(
+				&landlock_superblock(sb)->inode_refs));
+}
+
+/*
+ * Because a Landlock security policy is defined according to the filesystem
+ * layout (i.e. the mount namespace), changing it may grant access to files not
+ * previously allowed.
+ *
+ * To make it simple, deny any filesystem layout modification by landlocked
+ * processes. Non-landlocked processes may still change the namespace of a
+ * landlocked process, but this kind of threat must be handled by a system-wide
+ * access-control security policy.
+ *
+ * This could be lifted in the future if Landlock can safely handle mount
+ * namespace updates requested by a landlocked process. Indeed, we could
+ * update the current domain (which is currently read-only) by taking into
+ * account the accesses of the source and the destination of a new mount point.
+ * However, it would also require to make all the child domains dynamically
+ * inherit these new constraints. Anyway, for backward compatibility reasons,
+ * a dedicated user space option would be required (e.g. as a ruleset command
+ * option).
+ */
+static int hook_sb_mount(const char *const dev_name,
+		const struct path *const path, const char *const type,
+		const unsigned long flags, void *const data)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+static int hook_move_mount(const struct path *const from_path,
+		const struct path *const to_path)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+/*
+ * Removing a mount point may reveal a previously hidden file hierarchy, which
+ * may then grant access to files, which may have previously been forbidden.
+ */
+static int hook_sb_umount(struct vfsmount *const mnt, const int flags)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+static int hook_sb_remount(struct super_block *const sb, void *const mnt_opts)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+/*
+ * pivot_root(2), like mount(2), changes the current mount namespace. It must
+ * then be forbidden for a landlocked process.
+ *
+ * However, chroot(2) may be allowed because it only changes the relative root
+ * directory of the current process. Moreover, it can be used to restrict the
+ * view of the filesystem.
+ */
+static int hook_sb_pivotroot(const struct path *const old_path,
+		const struct path *const new_path)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+/* Path hooks */
+
+static inline u32 get_mode_access(const umode_t mode)
+{
+	switch (mode & S_IFMT) {
+	case S_IFLNK:
+		return LANDLOCK_ACCESS_FS_MAKE_SYM;
+	case 0:
+		/* A zero mode translates to S_IFREG. */
+	case S_IFREG:
+		return LANDLOCK_ACCESS_FS_MAKE_REG;
+	case S_IFDIR:
+		return LANDLOCK_ACCESS_FS_MAKE_DIR;
+	case S_IFCHR:
+		return LANDLOCK_ACCESS_FS_MAKE_CHAR;
+	case S_IFBLK:
+		return LANDLOCK_ACCESS_FS_MAKE_BLOCK;
+	case S_IFIFO:
+		return LANDLOCK_ACCESS_FS_MAKE_FIFO;
+	case S_IFSOCK:
+		return LANDLOCK_ACCESS_FS_MAKE_SOCK;
+	default:
+		WARN_ON_ONCE(1);
+		return 0;
+	}
+}
+
+/*
+ * Creating multiple links or renaming may lead to privilege escalations if not
+ * handled properly. Indeed, we must be sure that the source doesn't gain more
+ * privileges by being accessible from the destination. This is getting more
+ * complex when dealing with multiple layers. The whole picture can be seen as
+ * a multilayer partial ordering problem. A future version of Landlock will
+ * deal with that.
+ */
+static int hook_path_link(struct dentry *const old_dentry,
+		const struct path *const new_dir,
+		struct dentry *const new_dentry)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	/* The mount points are the same for old and new paths, cf. EXDEV. */
+	if (old_dentry->d_parent != new_dir->dentry)
+		/* For now, forbid reparenting. */
+		return -EACCES;
+	if (unlikely(d_is_negative(old_dentry)))
+		return -EACCES;
+	return check_access_path(dom, new_dir,
+			get_mode_access(d_backing_inode(old_dentry)->i_mode));
+}
+
+static inline u32 maybe_remove(const struct dentry *const dentry)
+{
+	if (d_is_negative(dentry))
+		return 0;
+	return d_is_dir(dentry) ? LANDLOCK_ACCESS_FS_REMOVE_DIR :
+		LANDLOCK_ACCESS_FS_REMOVE_FILE;
+}
+
+static int hook_path_rename(const struct path *const old_dir,
+		struct dentry *const old_dentry,
+		const struct path *const new_dir,
+		struct dentry *const new_dentry)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	/* The mount points are the same for old and new paths, cf. EXDEV. */
+	if (old_dir->dentry != new_dir->dentry)
+		/* For now, forbid reparenting. */
+		return -EACCES;
+	if (WARN_ON_ONCE(d_is_negative(old_dentry)))
+		return -EACCES;
+	/* RENAME_EXCHANGE is handled because directories are the same. */
+	return check_access_path(dom, old_dir, maybe_remove(old_dentry) |
+			maybe_remove(new_dentry) |
+			get_mode_access(d_backing_inode(old_dentry)->i_mode));
+}
+
+static int hook_path_mkdir(const struct path *const dir,
+		struct dentry *const dentry, const umode_t mode)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_DIR);
+}
+
+static int hook_path_mknod(const struct path *const dir,
+		struct dentry *const dentry, const umode_t mode,
+		const unsigned int dev)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	return check_access_path(dom, dir, get_mode_access(mode));
+}
+
+static int hook_path_symlink(const struct path *const dir,
+		struct dentry *const dentry, const char *const old_name)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_SYM);
+}
+
+static int hook_path_unlink(const struct path *const dir,
+		struct dentry *const dentry)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_FILE);
+}
+
+static int hook_path_rmdir(const struct path *const dir,
+		struct dentry *const dentry)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_DIR);
+}
+
+/* File hooks */
+
+static inline u32 get_file_access(const struct file *const file)
+{
+	u32 access = 0;
+
+	if (file->f_mode & FMODE_READ) {
+		/* A directory can only be opened in read mode. */
+		if (S_ISDIR(file_inode(file)->i_mode))
+			return LANDLOCK_ACCESS_FS_READ_DIR;
+		access = LANDLOCK_ACCESS_FS_READ_FILE;
+	}
+	if (file->f_mode & FMODE_WRITE)
+		access |= LANDLOCK_ACCESS_FS_WRITE_FILE;
+	/* __FMODE_EXEC is indeed part of f_flags, not f_mode. */
+	if (file->f_flags & __FMODE_EXEC)
+		access |= LANDLOCK_ACCESS_FS_EXECUTE;
+	return access;
+}
+
+static int hook_file_open(struct file *const file)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	/*
+	 * Because a file may be opened with O_PATH, get_file_access() may
+	 * return 0. This case will be handled with a future Landlock
+	 * evolution.
+	 */
+	return current_check_access_path(&file->f_path, get_file_access(file));
+}
+
+static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
+	LSM_HOOK_INIT(sb_delete, hook_sb_delete),
+	LSM_HOOK_INIT(sb_mount, hook_sb_mount),
+	LSM_HOOK_INIT(move_mount, hook_move_mount),
+	LSM_HOOK_INIT(sb_umount, hook_sb_umount),
+	LSM_HOOK_INIT(sb_remount, hook_sb_remount),
+	LSM_HOOK_INIT(sb_pivotroot, hook_sb_pivotroot),
+
+	LSM_HOOK_INIT(path_link, hook_path_link),
+	LSM_HOOK_INIT(path_rename, hook_path_rename),
+	LSM_HOOK_INIT(path_mkdir, hook_path_mkdir),
+	LSM_HOOK_INIT(path_mknod, hook_path_mknod),
+	LSM_HOOK_INIT(path_symlink, hook_path_symlink),
+	LSM_HOOK_INIT(path_unlink, hook_path_unlink),
+	LSM_HOOK_INIT(path_rmdir, hook_path_rmdir),
+
+	LSM_HOOK_INIT(file_open, hook_file_open),
+};
+
+__init void landlock_add_fs_hooks(void)
+{
+	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+			LANDLOCK_NAME);
+}
diff --git a/security/landlock/fs.h b/security/landlock/fs.h
new file mode 100644
index 000000000000..9f14ec4d8d48
--- /dev/null
+++ b/security/landlock/fs.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Filesystem management and hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_FS_H
+#define _SECURITY_LANDLOCK_FS_H
+
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/rcupdate.h>
+
+#include "ruleset.h"
+#include "setup.h"
+
+struct landlock_inode_security {
+	/*
+	 * @object: Weak pointer to an allocated object. All writes (i.e.
+	 * creating a new object or removing one) are protected by the
+	 * underlying inode->i_lock. Disassociating @object from the inode is
+	 * additionally protected by @object->lock, from the time @object's
+	 * usage refcount drops to zero to the time this pointer is nulled out.
+	 * Cf. release_inode().
+	 */
+	struct landlock_object __rcu *object;
+};
+
+struct landlock_superblock_security {
+	/*
+	 * @inode_refs: References to Landlock underlying objects.
+	 * Cf. struct super_block->s_fsnotify_inode_refs .
+	 */
+	atomic_long_t inode_refs;
+};
+
+static inline struct landlock_inode_security *landlock_inode(
+		const struct inode *const inode)
+{
+	return inode->i_security + landlock_blob_sizes.lbs_inode;
+}
+
+static inline struct landlock_superblock_security *landlock_superblock(
+		const struct super_block *const superblock)
+{
+	return superblock->s_security + landlock_blob_sizes.lbs_superblock;
+}
+
+__init void landlock_add_fs_hooks(void);
+
+int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
+		const struct path *const path, u32 access_hierarchy);
+
+#endif /* _SECURITY_LANDLOCK_FS_H */
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index b734f597bb0e..2a0a1095ee27 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -10,8 +10,12 @@
 #define _SECURITY_LANDLOCK_LIMITS_H
 
 #include <linux/limits.h>
+#include <uapi/linux/landlock.h>
 
 #define LANDLOCK_MAX_NUM_LAYERS		64
 #define LANDLOCK_MAX_NUM_RULES		U32_MAX
 
+#define LANDLOCK_LAST_ACCESS_FS		LANDLOCK_ACCESS_FS_MAKE_SYM
+#define LANDLOCK_MASK_ACCESS_FS		((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
+
 #endif /* _SECURITY_LANDLOCK_LIMITS_H */
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index bf7ff66c1b12..548636a68b48 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -112,10 +112,12 @@ static void build_check_ruleset(void)
 	const struct landlock_ruleset ruleset = {
 		.num_rules = ~0,
 		.num_layers = ~0,
+		.fs_access_mask = ~0,
 	};
 
 	BUILD_BUG_ON(ruleset.num_rules < LANDLOCK_MAX_NUM_RULES);
 	BUILD_BUG_ON(ruleset.num_layers < LANDLOCK_MAX_NUM_LAYERS);
+	BUILD_BUG_ON(ruleset.fs_access_mask < LANDLOCK_MASK_ACCESS_FS);
 }
 
 /**
@@ -214,9 +216,11 @@ static void build_check_layer(void)
 {
 	const struct landlock_layer layer = {
 		.level = ~0,
+		.access = ~0,
 	};
 
 	BUILD_BUG_ON(layer.level < LANDLOCK_MAX_NUM_LAYERS);
+	BUILD_BUG_ON(layer.access < LANDLOCK_MASK_ACCESS_FS);
 }
 
 int landlock_insert_rule(struct landlock_ruleset *const ruleset,
diff --git a/security/landlock/setup.c b/security/landlock/setup.c
index a5d6ef334991..f8e8e980454c 100644
--- a/security/landlock/setup.c
+++ b/security/landlock/setup.c
@@ -11,17 +11,24 @@
 
 #include "common.h"
 #include "cred.h"
+#include "fs.h"
 #include "ptrace.h"
 #include "setup.h"
 
+bool landlock_initialized __lsm_ro_after_init = false;
+
 struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
 	.lbs_cred = sizeof(struct landlock_cred_security),
+	.lbs_inode = sizeof(struct landlock_inode_security),
+	.lbs_superblock = sizeof(struct landlock_superblock_security),
 };
 
 static int __init landlock_init(void)
 {
 	landlock_add_cred_hooks();
 	landlock_add_ptrace_hooks();
+	landlock_add_fs_hooks();
+	landlock_initialized = true;
 	pr_info("Up and running.\n");
 	return 0;
 }
diff --git a/security/landlock/setup.h b/security/landlock/setup.h
index 9fdbf33fcc33..1daffab1ab4b 100644
--- a/security/landlock/setup.h
+++ b/security/landlock/setup.h
@@ -11,6 +11,8 @@
 
 #include <linux/lsm_hooks.h>
 
+extern bool landlock_initialized;
+
 extern struct lsm_blob_sizes landlock_blob_sizes;
 
 #endif /* _SECURITY_LANDLOCK_SETUP_H */
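
For readers following the layer-mask logic in check_access_path() and check_access_path_continue(), here is a rough user-space sketch of the same idea. It is not kernel code and is only meant as an illustration: every toy_* name, the two TOY_* access bits, and the parent-pointer tree standing in for the dentry walk are assumptions made up for this example. Mount crossing, RCU, negative dentries and the fs_access_mask filtering are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits, standing in for LANDLOCK_ACCESS_FS_*. */
#define TOY_WRITE (1ULL << 1)
#define TOY_READ  (1ULL << 2)

/* One (layer level, granted access) pair of a rule; level is 1-based. */
struct toy_layer {
	unsigned int level;
	uint64_t access;
};

/* A path component; layers == NULL means no rule is tied to it. */
struct toy_node {
	const struct toy_node *parent;
	const struct toy_layer *layers;
	size_t num_layers;
};

/*
 * Mirrors the loop of check_access_path_continue(): for each layer of the
 * rule that is still unseen, either clear its bit in the mask (the layer
 * grants the full request) or stop the walk (it does not).
 */
static bool toy_continue(const struct toy_node *node, uint64_t request,
		uint64_t *layer_mask)
{
	for (size_t i = 0; i < node->num_layers; i++) {
		const struct toy_layer *layer = &node->layers[i];
		const uint64_t bit = 1ULL << (layer->level - 1);

		if (!(bit & *layer_mask))
			continue;
		if ((layer->access & request) != request)
			return false;
		*layer_mask &= ~bit;
	}
	return true;
}

/* Walks from node up to the root; allowed once every layer has granted. */
static bool toy_check(const struct toy_node *node, unsigned int num_layers,
		uint64_t request)
{
	uint64_t layer_mask;

	if (num_layers < 1 || num_layers > 64)
		return false;
	/* Same value as GENMASK_ULL(num_layers - 1, 0). */
	layer_mask = ~0ULL >> (64 - num_layers);
	for (; node; node = node->parent) {
		if (!toy_continue(node, request, &layer_mask))
			return false;
		if (layer_mask == 0)
			return true;
	}
	/* Reached the root with some layer never granting the request. */
	return false;
}

/* Layer 1 allows reads under /home, layer 2 read+write under /home/user. */
static const struct toy_layer home_layers[] = { { 1, TOY_READ } };
static const struct toy_layer user_layers[] = { { 2, TOY_READ | TOY_WRITE } };
static const struct toy_node root = { NULL, NULL, 0 };
static const struct toy_node home = { &root, home_layers, 1 };
static const struct toy_node user = { &home, user_layers, 1 };
static const struct toy_node file = { &user, NULL, 0 };
static const struct toy_node tmp = { &root, NULL, 0 };

void toy_demo(void)
{
	assert(toy_check(&file, 2, TOY_READ));   /* both layers grant read */
	assert(!toy_check(&file, 2, TOY_WRITE)); /* layer 1 never grants write */
	assert(!toy_check(&tmp, 2, TOY_READ));   /* no rule on the way up */
}
```

In this toy hierarchy a read of /home/user/file is allowed because layer 2 grants it at /home/user and layer 1 grants it at /home, while a write is denied as soon as the walk reaches /home, where the still-unseen layer 1 does not include the write bit. That matches the early "return false" in the patch's loop.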