diff mbox

arm64: mark kernel text segment as MEMBLOCK_NOMAP

Message ID 1455528512-32491-1-git-send-email-ard.biesheuvel@linaro.org
State New
Headers show

Commit Message

Ard Biesheuvel Feb. 15, 2016, 9:28 a.m. UTC
Commit 752af28bd711 ("arm64: move kernel image to base of vmalloc area")
moves the mapping of the kernel text and data segments out of the linear
region, and takes care not to create a writable alias of the read-only
kernel text segment by checking each memblock against overlap when the
memblocks are mapped into the linear mapping.

However, it is more correct, and much simpler, to mark the [_stext, _etext]
interval as MEMBLOCK_NOMAP. This will also prevent the interval from being
omitted from the linear region, but this fact will now also be reflected
in the output of pfn_valid(), and so code that expects any pfn_valid()
page to be mapped and accessible (which is a reasonable assumption) does
not get surprised by the text segment being inaccessible via the linear
mapping.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---

This should hopefully address the issue reported by James, but I suppose
more work is required on the hibernate side to ensure the unmapped text
region is preserved, since it is no longer covered by the linear mapping.

 arch/arm64/mm/init.c |  1 +
 arch/arm64/mm/mmu.c  | 39 ++------------------
 2 files changed, 4 insertions(+), 36 deletions(-)

-- 
2.5.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Comments

Catalin Marinas Feb. 15, 2016, 11:45 a.m. UTC | #1
On Mon, Feb 15, 2016 at 10:28:32AM +0100, Ard Biesheuvel wrote:
> Commit 752af28bd711 ("arm64: move kernel image to base of vmalloc area")

> moves the mapping of the kernel text and data segments out of the linear

> region, and takes care not to create a writable alias of the read-only

> kernel text segment by checking each memblock against overlap when the

> memblocks are mapped into the linear mapping.

> 

> However, it is more correct, and much simpler, to mark the [_stext, _etext]

> interval as MEMBLOCK_NOMAP. This will also prevent the interval from being

> omitted from the linear region, but this fact will now also be reflected

> in the output of pfn_valid(), and so code that expects any pfn_valid()

> page to be mapped and accessible (which is a reasonable assumption) does

> not get surprised by the text segment being inaccessible via the linear

> mapping.

> 

> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---

> 

> This should hopefully address the issue reported by James, but I suppose

> more work is required on the hibernate side to ensure the unmapped text

> region is preserved, since it is no longer covered by the linear mapping.


Ard, I'm about to drop the kernel memory map patches from -next. There
are several issues like KASAN (which may as well be a KASAN bug but I
haven't got to the bottom of it yet), initrd (you have patches but
require additional acks).

I'll keep your stuff on the for-next/kernmap branch and do further
debugging. If it stabilises in the next 1-2 weeks, I'll merge it into
for-next/core for 4.6, otherwise, it will have to wait. I would really
like these patches merged but it looks like they need wider testing.

I have another branch with your kaslr patches but I didn't dare to merge
them into for-next/core until we sort out the first sub-series.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ard Biesheuvel Feb. 15, 2016, 11:53 a.m. UTC | #2
On 15 February 2016 at 12:45, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Mon, Feb 15, 2016 at 10:28:32AM +0100, Ard Biesheuvel wrote:

>> Commit 752af28bd711 ("arm64: move kernel image to base of vmalloc area")

>> moves the mapping of the kernel text and data segments out of the linear

>> region, and takes care not to create a writable alias of the read-only

>> kernel text segment by checking each memblock against overlap when the

>> memblocks are mapped into the linear mapping.

>>

>> However, it is more correct, and much simpler, to mark the [_stext, _etext]

>> interval as MEMBLOCK_NOMAP. This will also prevent the interval from being

>> omitted from the linear region, but this fact will now also be reflected

>> in the output of pfn_valid(), and so code that expects any pfn_valid()

>> page to be mapped and accessible (which is a reasonable assumption) does

>> not get surprised by the text segment being inaccessible via the linear

>> mapping.

>>

>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

>> ---

>>

>> This should hopefully address the issue reported by James, but I suppose

>> more work is required on the hibernate side to ensure the unmapped text

>> region is preserved, since it is no longer covered by the linear mapping.

>

> Ard, I'm about to drop the kernel memory map patches from -next. There

> are several issues like KASAN (which may as well be a KASAN bug but I

> haven't got to the bottom of it yet), initrd (you have patches but

> require additional acks).

>

> I'll keep your stuff on the for-next/kernmap branch and do further

> debugging. If it stabilises in the next 1-2 weeks, I'll merge it into

> for-next/core for 4.6, otherwise, it will have to wait. I would really

> like these patches merged but it looks like they need wider testing.

>


Agreed.

> I have another branch with your kaslr patches but I didn't dare to merge

> them into for-next/core until we sort out the first sub-series.

>


Yes, I noticed. As I pointed out in its cover letter, I expected the
first subseries to be the most problematic in terms of fallout in
other areas, and I turned out to be right, unfortunately. The coverage
so far has been quite useful tbh, especially since I apparently never
tested initrd.

If you are going to drop it for now anyway, I can do a v6sub1 with
this issue and the initrd issue addressed, but also an issue with the
KVM ksym ref patch with GCC 4.8. Since these issues affect
bisectability, I'd prefer to rebase entirely, and fold all the fixes
rather than apply them on top.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ard Biesheuvel Feb. 15, 2016, 11:56 a.m. UTC | #3
On 15 February 2016 at 12:53, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 15 February 2016 at 12:45, Catalin Marinas <catalin.marinas@arm.com> wrote:

>> On Mon, Feb 15, 2016 at 10:28:32AM +0100, Ard Biesheuvel wrote:

>>> Commit 752af28bd711 ("arm64: move kernel image to base of vmalloc area")

>>> moves the mapping of the kernel text and data segments out of the linear

>>> region, and takes care not to create a writable alias of the read-only

>>> kernel text segment by checking each memblock against overlap when the

>>> memblocks are mapped into the linear mapping.

>>>

>>> However, it is more correct, and much simpler, to mark the [_stext, _etext]

>>> interval as MEMBLOCK_NOMAP. This will also prevent the interval from being

>>> omitted from the linear region, but this fact will now also be reflected

>>> in the output of pfn_valid(), and so code that expects any pfn_valid()

>>> page to be mapped and accessible (which is a reasonable assumption) does

>>> not get surprised by the text segment being inaccessible via the linear

>>> mapping.

>>>

>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

>>> ---

>>>

>>> This should hopefully address the issue reported by James, but I suppose

>>> more work is required on the hibernate side to ensure the unmapped text

>>> region is preserved, since it is no longer covered by the linear mapping.

>>

>> Ard, I'm about to drop the kernel memory map patches from -next. There

>> are several issues like KASAN (which may as well be a KASAN bug but I

>> haven't got to the bottom of it yet), initrd (you have patches but

>> require additional acks).

>>

>> I'll keep your stuff on the for-next/kernmap branch and do further

>> debugging. If it stabilises in the next 1-2 weeks, I'll merge it into

>> for-next/core for 4.6, otherwise, it will have to wait. I would really

>> like these patches merged but it looks like they need wider testing.

>>

>

> Agreed.

>

>> I have another branch with your kaslr patches but I didn't dare to merge

>> them into for-next/core until we sort out the first sub-series.

>>

>

> Yes, I noticed. As I pointed out in its cover letter, I expected the

> first subseries to be the most problematic in terms of fallout in

> other areas, and I turned out to be right, unfortunately. The coverage

> so far has been quite useful tbh, especially since I apparently never

> tested initrd.

>

> If you are going to drop it for now anyway, I can do a v6sub1 with

> this issue and the initrd issue addressed, but also an issue with the

> KVM ksym ref patch with GCC 4.8.


Note that this is a transient issue that is gone after applying the
subsequent patch, but it would still be nice to get rid of it if we're
going to do another take anyway.

> Since these issues affect

> bisectability, I'd prefer to rebase entirely, and fold all the fixes

> rather than apply them on top.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Catalin Marinas Feb. 15, 2016, 12:08 p.m. UTC | #4
On Mon, Feb 15, 2016 at 12:53:59PM +0100, Ard Biesheuvel wrote:
> On 15 February 2016 at 12:45, Catalin Marinas <catalin.marinas@arm.com> wrote:

> > Ard, I'm about to drop the kernel memory map patches from -next. There

> > are several issues like KASAN (which may as well be a KASAN bug but I

> > haven't got to the bottom of it yet), initrd (you have patches but

> > require additional acks).

> >

> > I'll keep your stuff on the for-next/kernmap branch and do further

> > debugging. If it stabilises in the next 1-2 weeks, I'll merge it into

> > for-next/core for 4.6, otherwise, it will have to wait. I would really

> > like these patches merged but it looks like they need wider testing.

[...]
> If you are going to drop it for now anyway, I can do a v6sub1 with

> this issue and the initrd issue addressed, but also an issue with the

> KVM ksym ref patch with GCC 4.8. Since these issues affect

> bisectability, I'd prefer to rebase entirely, and fold all the fixes

> rather than apply them on top.


That's fine. I'll push one more patch to the for-next/pgtable branch,
converting __set_fixmap_offset() back to a macro as it broke other
architectures. Otherwise it is not rebased and I'll merge it into
for-next core.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ard Biesheuvel Feb. 15, 2016, 5:17 p.m. UTC | #5
On 15 February 2016 at 13:08, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Mon, Feb 15, 2016 at 12:53:59PM +0100, Ard Biesheuvel wrote:

>> On 15 February 2016 at 12:45, Catalin Marinas <catalin.marinas@arm.com> wrote:

>> > Ard, I'm about to drop the kernel memory map patches from -next. There

>> > are several issues like KASAN (which may as well be a KASAN bug but I

>> > haven't got to the bottom of it yet), initrd (you have patches but

>> > require additional acks).

>> >

>> > I'll keep your stuff on the for-next/kernmap branch and do further

>> > debugging. If it stabilises in the next 1-2 weeks, I'll merge it into

>> > for-next/core for 4.6, otherwise, it will have to wait. I would really

>> > like these patches merged but it looks like they need wider testing.

> [...]

>> If you are going to drop it for now anyway, I can do a v6sub1 with

>> this issue and the initrd issue addressed, but also an issue with the

>> KVM ksym ref patch with GCC 4.8. Since these issues affect

>> bisectability, I'd prefer to rebase entirely, and fold all the fixes

>> rather than apply them on top.

>

> That's fine. I'll push one more patch to the for-next/pgtable branch,

> converting __set_fixmap_offset() back to a macro as it broke other

> architectures. Otherwise it is not rebased and I'll merge it into

> for-next core.

>


I have identified (and fixed) a Kasan issue in my patches: instead of
leaving a hole for the 64 MB module window in the Kasan shadow region,
I populated it with zero mappings, which caused many strange errors,
and since the vmap() routines don't expect block mappings (which are
created for the kernel image shadow by vmemmap_populate()), I guess
false positive kasan warnings (like the ones you were seeing) are also
a possible symptom of this issue.

-- 
Ard.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Catalin Marinas Feb. 15, 2016, 5:35 p.m. UTC | #6
On Mon, Feb 15, 2016 at 06:17:46PM +0100, Ard Biesheuvel wrote:
> On 15 February 2016 at 13:08, Catalin Marinas <catalin.marinas@arm.com> wrote:

> > On Mon, Feb 15, 2016 at 12:53:59PM +0100, Ard Biesheuvel wrote:

> >> On 15 February 2016 at 12:45, Catalin Marinas <catalin.marinas@arm.com> wrote:

> >> > Ard, I'm about to drop the kernel memory map patches from -next. There

> >> > are several issues like KASAN (which may as well be a KASAN bug but I

> >> > haven't got to the bottom of it yet), initrd (you have patches but

> >> > require additional acks).

> >> >

> >> > I'll keep your stuff on the for-next/kernmap branch and do further

> >> > debugging. If it stabilises in the next 1-2 weeks, I'll merge it into

> >> > for-next/core for 4.6, otherwise, it will have to wait. I would really

> >> > like these patches merged but it looks like they need wider testing.

> > [...]

> >> If you are going to drop it for now anyway, I can do a v6sub1 with

> >> this issue and the initrd issue addressed, but also an issue with the

> >> KVM ksym ref patch with GCC 4.8. Since these issues affect

> >> bisectability, I'd prefer to rebase entirely, and fold all the fixes

> >> rather than apply them on top.

> >

> > That's fine. I'll push one more patch to the for-next/pgtable branch,

> > converting __set_fixmap_offset() back to a macro as it broke other

> > architectures. Otherwise it is not rebased and I'll merge it into

> > for-next core.

> 

> I have identified (and fixed) a Kasan issue in my patches: instead of

> leaving a hole for the 64 MB module window in the Kasan shadow region,

> I populated it with zero mappings, which caused many strange errors,

> and since the vmap() routines don't expect block mappings (which are

> created for the kernel image shadow by vmemmap_populate()),


This would probably fix the vmalloc warning caused from kasan when
loading modules.

> I guess false positive kasan warnings (like the ones you were seeing)

> are also a possible symptom of this issue.


I now get kasan warnings even on vanilla 4.5-rc1 (though not on 4.4).
I'm trying to figure out whether they are genuine or just false
positives:

BUG: KASAN: stack-out-of-bounds in clockevents_program_event+0x2c8/0x310 at addr ffffffc0016ffc08
Read of size 8 by task swapper/0/0
page:ffffffbdc205bfc0 count:1 mapcount:0 mapping:          (null) index:0x0
flags: 0x400(reserved)
page dumped because: kasan: bad access detected
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G    B           4.5.0-rc1+ #158
Hardware name: Juno (DT)
Call trace:
[<ffffffc00008f130>] dump_backtrace+0x0/0x358
[<ffffffc00008f49c>] show_stack+0x14/0x20
[<ffffffc000785d40>] dump_stack+0xf8/0x188
[<ffffffc000343bb4>] kasan_report_error+0x524/0x550
[<ffffffc000343cf8>] __asan_report_load8_noabort+0x40/0x48
[<ffffffc0001f2b38>] clockevents_program_event+0x2c8/0x310
[<ffffffc0001f737c>] tick_program_event+0xac/0x108
[<ffffffc0001d85c8>] hrtimer_start_range_ns+0x8a0/0xb20
[<ffffffc0001f8b50>] __tick_nohz_idle_enter+0x970/0xca8
[<ffffffc0001f9310>] tick_nohz_idle_enter+0x60/0x98
[<ffffffc0001933ec>] cpu_startup_entry+0x14c/0x448
[<ffffffc00119104c>] rest_init+0xc4/0xd0
[<ffffffc00161beec>] start_kernel+0x54c/0x578
[<ffffffc0000811e0>] 0xffffffc0000811e0
Memory state around the buggy address:
 ffffffc0016ffb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffffc0016ffb80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>ffffffc0016ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00

                      ^
 ffffffc0016ffc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffffc0016ffd00: 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ard Biesheuvel Feb. 16, 2016, 8:49 a.m. UTC | #7
On 15 February 2016 at 13:08, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Mon, Feb 15, 2016 at 12:53:59PM +0100, Ard Biesheuvel wrote:

>> On 15 February 2016 at 12:45, Catalin Marinas <catalin.marinas@arm.com> wrote:

>> > Ard, I'm about to drop the kernel memory map patches from -next. There

>> > are several issues like KASAN (which may as well be a KASAN bug but I

>> > haven't got to the bottom of it yet), initrd (you have patches but

>> > require additional acks).

>> >

>> > I'll keep your stuff on the for-next/kernmap branch and do further

>> > debugging. If it stabilises in the next 1-2 weeks, I'll merge it into

>> > for-next/core for 4.6, otherwise, it will have to wait. I would really

>> > like these patches merged but it looks like they need wider testing.

> [...]

>> If you are going to drop it for now anyway, I can do a v6sub1 with

>> this issue and the initrd issue addressed, but also an issue with the

>> KVM ksym ref patch with GCC 4.8. Since these issues affect

>> bisectability, I'd prefer to rebase entirely, and fold all the fixes

>> rather than apply them on top.

>

> That's fine. I'll push one more patch to the for-next/pgtable branch,

> converting __set_fixmap_offset() back to a macro as it broke other

> architectures. Otherwise it is not rebased and I'll merge it into

> for-next core.

>


If you are doing for-next/core from scratch, perhaps it is better to
squash the fixmap patches as well?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ard Biesheuvel Feb. 16, 2016, 10:57 a.m. UTC | #8
On 16 February 2016 at 11:50, James Morse <james.morse@arm.com> wrote:
> Hi Ard,

>

> On 15/02/16 09:28, Ard Biesheuvel wrote:

>> Commit 752af28bd711 ("arm64: move kernel image to base of vmalloc area")

>> moves the mapping of the kernel text and data segments out of the linear

>> region, and takes care not to create a writable alias of the read-only

>> kernel text segment by checking each memblock against overlap when the

>> memblocks are mapped into the linear mapping.

>>

>> However, it is more correct, and much simpler, to mark the [_stext, _etext]

>> interval as MEMBLOCK_NOMAP. This will also prevent the interval from being

>> omitted from the linear region, but this fact will now also be reflected

>> in the output of pfn_valid(), and so code that expects any pfn_valid()

>> page to be mapped and accessible (which is a reasonable assumption) does

>> not get surprised by the text segment being inaccessible via the linear

>> mapping.

>>

>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

>> ---

>

> With this patch on arm64/for-next/kernmap, I get the following KVM-related boot

> failure on Juno:

> -----------------------------%<-----------------------------

> RPC: Registered named UNIX socket transport module.

> RPC: Registered udp transport module.

> RPC: Registered tcp transport module.

> RPC: Registered tcp NFSv4.1 backchannel transport module.

> hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available

> ------------[ cut here ]------------

> kernel BUG at ../arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:577!

> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP

> Modules linked in:

> CPU: 4 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc1+ #2022

> Hardware name: ARM Juno development board (r1) (DT)

> task: ffffffc976cf8000 ti: ffffffc976d60000 task.ti: ffffffc976d60000

> PC is at create_hyp_mappings+0x144/0x148

> LR is at create_hyp_mappings+0x7c/0x148

>

> [SNIP]

>

> [<ffffff80040abc14>] create_hyp_mappings+0x144/0x148

> [<ffffff80040a9be8>] kvm_arch_init+0x198/0x53c

> [<ffffff800409fe24>] kvm_init+0x38/0x290

> [<ffffff80040a84e4>] arm_init+0x24/0x2c

> [<ffffff8004082994>] do_one_initcall+0x94/0x198

> [<ffffff8004a22b10>] kernel_init_freeable+0x148/0x1e8

> [<ffffff800473e344>] kernel_init+0x20/0xe4

> [<ffffff8004085cd0>] ret_from_fork+0x10/0x40

>

> -----------------------------%<-----------------------------

>

> The offending function is:

> -----------------------------%<-----------------------------

> static phys_addr_t kvm_kaddr_to_phys(void *kaddr)

> {

>         if (!is_vmalloc_addr(kaddr)) {

>                 BUG_ON(!virt_addr_valid(kaddr));

>                 return __pa(kaddr);

>         } else {

>                 return page_to_phys(vmalloc_to_page(kaddr)) +

>                        offset_in_page(kaddr);

>         }

> }

> -----------------------------%<-----------------------------

>

> When the BUG_ON() fires, kaddr points into the hole in the linear map left by

> the kernel text.

>


I fixed this in the v6 series I am about to send out. I replaced the
BUG_ON() condition with 'memblock_is_memory(__pa(kaddr))', so that it
takes linear addresses pointing into  MEMBLOCK_NOMAP regions into
account as well.

>

>> This should hopefully address the issue reported by James, but I suppose

>> more work is required on the hibernate side to ensure the unmapped text

>> region is preserved, since it is no longer covered by the linear mapping.

>

> It does, now I have to work out how best to fix hibernate!

>


Indeed. The alternative would be to map the [_stext, _etext] interval
into the linear region as a ro/nx alias of the actual mapping in the
vmalloc region. That would probably make your life easier, but I am
not convinced the overall result is better.

-- 
Ard.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Catalin Marinas Feb. 16, 2016, 10:57 a.m. UTC | #9
On Tue, Feb 16, 2016 at 09:49:48AM +0100, Ard Biesheuvel wrote:
> On 15 February 2016 at 13:08, Catalin Marinas <catalin.marinas@arm.com> wrote:

> > On Mon, Feb 15, 2016 at 12:53:59PM +0100, Ard Biesheuvel wrote:

> >> On 15 February 2016 at 12:45, Catalin Marinas <catalin.marinas@arm.com> wrote:

> >> > Ard, I'm about to drop the kernel memory map patches from -next. There

> >> > are several issues like KASAN (which may as well be a KASAN bug but I

> >> > haven't got to the bottom of it yet), initrd (you have patches but

> >> > require additional acks).

> >> >

> >> > I'll keep your stuff on the for-next/kernmap branch and do further

> >> > debugging. If it stabilises in the next 1-2 weeks, I'll merge it into

> >> > for-next/core for 4.6, otherwise, it will have to wait. I would really

> >> > like these patches merged but it looks like they need wider testing.

> > [...]

> >> If you are going to drop it for now anyway, I can do a v6sub1 with

> >> this issue and the initrd issue addressed, but also an issue with the

> >> KVM ksym ref patch with GCC 4.8. Since these issues affect

> >> bisectability, I'd prefer to rebase entirely, and fold all the fixes

> >> rather than apply them on top.

> >

> > That's fine. I'll push one more patch to the for-next/pgtable branch,

> > converting __set_fixmap_offset() back to a macro as it broke other

> > architectures. Otherwise it is not rebased and I'll merge it into

> > for-next core.

> 

> If you are doing for-next/core from scratch, perhaps it is better to

> squash the fixmap patches as well?


I could, assuming you were the only one using the for-next/pgtable
branch and I won't get pull requests against it (I usually cherry-pick
the patches, so not a problem).

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ard Biesheuvel Feb. 16, 2016, 12:11 p.m. UTC | #10
On 16 February 2016 at 12:13, James Morse <james.morse@arm.com> wrote:
> On 16/02/16 10:57, Ard Biesheuvel wrote:

>> On 16 February 2016 at 11:50, James Morse <james.morse@arm.com> wrote:

>>> On 15/02/16 09:28, Ard Biesheuvel wrote:

>>>> This should hopefully address the issue reported by James, but I suppose

>>>> more work is required on the hibernate side to ensure the unmapped text

>>>> region is preserved, since it is no longer covered by the linear mapping.

>>>

>>> It does, now I have to work out how best to fix hibernate!

>>>

>>

>> Indeed. The alternative would be to map the [_stext, _etext] interval

>> into the linear region as a ro/nx alias of the actual mapping in the

>> vmalloc region. That would probably make your life easier, but I am

>> not convinced the overall result is better.

>

> I assumed you would NACK any approach that looked like that:

> It could be done temporarily while the memory snapshot is being taken, only

> swsusp_save() would see this mapping, it can be removed again afterwards. But

> pfn_valid() will still say those pages aren't there, and there is no

> memblock_clear_nomap() to change this.

>


Actually, we may have memblock_clear_nomap() at some point, since we
need it to simply relocate_initrd(), i.e., moving the initrd if its
memory is not covered by the linear mapping is much more complicated
than simply adding its memory to the linear mapping, and that requires
clearing the NOMAP flag as well if the region intersects with the UEFI
memory map (this is a corner case but one that does occur in practice)

But in this case, clearing the NOMAP flag only to appease pfn_valid()
is not the right approach, imo. Once the linear mapping is created, we
should not change the underlying memblock regions anymore. Also, what
pfn_valid() tells you should be in sync with what mappings are
actually present in the linear region (i.e., the issue you hit
originally)

> The current approach involves copying the kernel text into pages that are saved,

> and having two lists of pages that need restoring. (wheels within wheels!)

> Pretty unpleasant, and requires another patch in the core code, so likely won't

> be possible for v4.6. I should have the patches out for review later today.

>


I think a ro/nx alias in the linear region for [_stext, _etext] is
pretty harmless, and much preferred over having to add hacks on top of
generic layers like hibernate. The cost in terms of page tables and
TLB entries is zero, since the adjacent regions are mapped anyway, and
the region is never accessed during normal use.

So I will fold that in before sending out my v6.

-- 
Ard.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
diff mbox

Patch

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 023c41f22b5b..e895fb6ff9dd 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -218,6 +218,7 @@  void __init arm64_memblock_init(void)
 	 * pagetables with memblock.
 	 */
 	memblock_reserve(__pa(_text), _end - _text);
+	memblock_mark_nomap(__pa(_stext), _etext - _stext);
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (initrd_start) {
 		memblock_reserve(initrd_start, initrd_end - initrd_start);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 5d7e0b801ab7..5ca2f315ba9d 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -352,41 +352,6 @@  static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 			     late_pgtable_alloc);
 }
 
-static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
-{
-
-	unsigned long kernel_start = __pa(_stext);
-	unsigned long kernel_end = __pa(_etext);
-
-	/*
-	 * Take care not to create a writable alias for the
-	 * read-only text and rodata sections of the kernel image.
-	 */
-
-	/* No overlap with the kernel text */
-	if (end < kernel_start || start >= kernel_end) {
-		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
-				     end - start, PAGE_KERNEL,
-				     early_pgtable_alloc);
-		return;
-	}
-
-	/*
-	 * This block overlaps the kernel text mapping. Map the portion(s) which
-	 * don't overlap.
-	 */
-	if (start < kernel_start)
-		__create_pgd_mapping(pgd, start,
-				     __phys_to_virt(start),
-				     kernel_start - start, PAGE_KERNEL,
-				     early_pgtable_alloc);
-	if (kernel_end < end)
-		__create_pgd_mapping(pgd, kernel_end,
-				     __phys_to_virt(kernel_end),
-				     end - kernel_end, PAGE_KERNEL,
-				     early_pgtable_alloc);
-}
-
 static void __init map_mem(pgd_t *pgd)
 {
 	struct memblock_region *reg;
@@ -401,7 +366,9 @@  static void __init map_mem(pgd_t *pgd)
 		if (memblock_is_nomap(reg))
 			continue;
 
-		__map_memblock(pgd, start, end);
+		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
+				     end - start, PAGE_KERNEL,
+				     early_pgtable_alloc);
 	}
 }