Impact: fix crash during hibernation on 32-bit NUMA
The NUMA code on x86_32 creates special memory mapping that allows
each node's pgdat to be located in this node's memory. For this
purpose it allocates a memory area at the end of each node's memory
and maps this area so that it is accessible with virtual addresses
belonging to low memory. As a result, if there is high memory,
these NUMA-allocated areas are physically located in high memory,
although they are mapped to low memory addresses.
Our hibernation code does not take that into account and for this
reason hibernation fails on all x86_32 systems with CONFIG_NUMA=y and
with high memory present. Fix this by adding a special mapping for
the NUMA-allocated memory areas to the temporary page tables created
during the last phase of resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Don't do misalignment handling for userspace misalignment faults: just
generate an appropriate SIGBUS instead.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't handle the misaligned loading and storing of the SP register as in C code
that's most certainly a compiler bug.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Support misalignment handling for instructions that have kernel SP-based
address operands, including fixing those that include IMM8 or IMM16
displacements.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix misalignment handling for an address calculated from the sum of two
registers.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Correctly handle misalignment in MOV instructions with postinc-with-immediate
addressing mode operands. In these, the immediate value is the increment to
be applied the address register, not the displacement to the address.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Perform misalignment fixups of the MOV_Lcc instructions (move postinc memory
to register and conditionally loop).
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow misalignment fixup in interrupt handling in the MN10300 arch.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix misalignment handling of operands with register postincrement addressing.
The flag to indicate that postincrement is required should not be interpreted
as an specification of a value to be added to the address.
Also add BUGs to catch unimplemented parameter markings in the opcodes table.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Extract the displacement from an MN10300 instruction correctly in the
misalignment fixup handler.
The code should extract the displacement in LSB order, not MSB order.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add further misalignment fixup support to the MN10300 arch, notably for ABS32
and SP+disp addressing.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clean up the MN10300 misalignment handler a little by:
(1) Use ilog2() rather than doing implementing log2() locally.
(2) Make format_tbl[] const and static.
(3) Making the debugging prints more consistent.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the PMB enabled, only P1SEG and up are covered by the PMB mappings,
meaning that situations where out-of-bounds physical addresses are read
from will lead to TLB reset after the PMB miss, allowing for use cases
like dd if=/dev/mem to reset the TLB.
Fix this up to make sure the reference is between __MEMORY_START (phys)
and __pa(high_memory). This is coherent across all variants of sh/sh64
with and without MMU, though the PMB bug itself is only applicable to
SH-4A parts.
Reported-by: Hideo Saito <saito@densan.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch changes sci type of SH7723 from PORT_SCI to PORT_SCIFA.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This removes the acpi_irq_balance_set() interface from the PCI
interrupt link driver.
x86 used acpi_irq_balance_set() to tell the PCI interrupt link
driver to configure links to minimize IRQ sharing. But the link
driver can easily figure out whether to turn on IRQ balancing
based on the IRQ model (PIC/IOAPIC/etc), so we can get rid of
that external interface.
It's better for the driver to figure this out at init-time. If
we set it externally via the x86 code, the interface reduces
modularity, and we depend on the fact that acpi_process_madt()
happens before we process the kernel command line.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Now that the critical read back to flush the next descriptor address is
fixed we can downgrade some BUG_ONs that need only be enabled when testing
changes to the driver.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the case of !CONFIG_SMP, raw_spinlock_t is empty and the spinlock functions
don't build. Fix by defining spinlock functions for the uniprocessor case.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
There is a potential issue that, when guest using pagetable without vmexit when
EPT enabled, guest would use PAT/PCD/PWT bits to index PAT msr for it's memory,
which would be inconsistent with host side and would cause host MCE due to
inconsistent cache attribute.
The patch set IGMT bit in EPT entry to ignore guest PAT and use WB as default
memory type to protect host (notice that all memory mapped by KVM should be WB).
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
WARNING: arch/x86/kernel/built-in.o(.text+0x1722c): Section mismatch
in reference from the function kvm_setup_secondary_clock() to the
function .devinit.text:setup_secondary_APIC_clock()
The function kvm_setup_secondary_clock() references
the function __devinit setup_secondary_APIC_clock().
This is often because kvm_setup_secondary_clock lacks a __devinit
annotation or the annotation of setup_secondary_APIC_clock is wrong.
Signed-off-by: Md.Rakib H. Mullick <rakib.mullick@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Before a vcpu blocks, it should switch to the guest signal mask to allow
signals to unblock it.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The page fault path can use two rmap_desc structures, if:
- walk_addr's dirty pte update allocates one rmap_desc.
- mmu_lock is dropped, sptes are zapped resulting in rmap_desc being
freed.
- fetch->mmu_set_spte allocates another rmap_desc.
Increase to 4 for safety.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Impact: build/boot fix for x86/Voyager
This change:
| commit 3d44223327
| Author: Jens Axboe <jens.axboe@oracle.com>
| Date: Thu Jun 26 11:21:34 2008 +0200
|
| Add generic helpers for arch IPI function calls
didn't wire up the voyager smp call function correctly, so do that
here. Also make CONFIG_USE_GENERIC_SMP_HELPERS a def_bool y again,
since we now use the generic helpers for every x86 architecture.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <Jens.Axboe@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Turned off CONFIG_PCI_LEGACY and turned on EXT4, and otherwise mostly
took the defaults. This also updates ppc6xx_defconfig, which covers
the 6xx/7xx/7xxx-based embedded boards.
Signed-off-by: Paul Mackerras <paulus@samba.org>
The new context may not be 16-byte aligned, so the real address of the
mcontext structure should be read from the uc_regs pointer instead of
directly using the (unaligned) uc_mcontext field.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
While investigating the failure of hibernation on 32-bit x86 with
CONFIG_NUMA set, as described in this message
http://marc.info/?l=linux-kernel&m=122634118116226&w=4
I asked some people for help and I was told that it wasn't really
worth the effort, because CONFIG_NUMA was generally broken on 32-bit
x86 systems and it shouldn't be used in such configs. For this
reason, make CONFIG_NUMA depend on BROKEN instead of EXPERIMENTAL on
x86-32.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Pavel Machek <pavel@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 8dd9453737.
This fixes a boot failure reported by Robert Reif.
The code above the section change expects to fallthrough, so
we can't make such a section change here.
Some functions that may be called from this handler require that
interrupts are disabled. Also, combining IRQF_DISABLED and
IRQF_SHARED does not reliably disable interrupts in a handler, so
remove IRQF_SHARED from the irq flags (this irq is not shared anyway).
Signed-off-by: Matt Fleming <mjf@gentoo.org>
Cc: mingo@elte.hu
Cc: venkatesh.pallipadi@intel.com
Cc: "Will Newton" <will.newton@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In hpet_next_event() we check that the value we just wrote to
HPET_Tn_CMP(timer) has reached the chip. Currently, we're checking that
the value we wrote to HPET_Tn_CMP(timer) is in HPET_T0_CMP, which, if
timer is anything other than timer 0, is likely to fail.
Signed-off-by: Matt Fleming <mjf@gentoo.org>
Cc: mingo@elte.hu
Cc: venkatesh.pallipadi@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It is possible to flood the console with call traces if the WARN_ON
condition is true because of the frequency with which this function is
called.
Signed-off-by: Matt Fleming <mjf@gentoo.org>
Cc: mingo@elte.hu
Cc: venkatesh.pallipadi@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There was a race in the kmap_coherent() implementation. While we
guarded against preemption, there was nothing preventing eviction of
the pre-faulted fixmap entry from the UTLB. Under certain workloads
this would result in the fixmap entries used for cache colouring being
evicted from the UTLB in the midst of a copy_page().
In addition to pre-faulting, we also make sure to preserve the PTEs
in the kernel page table and introduce a cached PTE for kmap_coherent()
usage. This follows a similar change on MIPS ("[MIPS] Fix aliasing bug
in copy_to_user_page / copy_from_user_page").
Reported-by: Hideo Saito <saito@densan.co.jp>
Reported-by: CHIKAMA Masaki <masaki.chikama@gmail.com>
Tested-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
a new file was accidentally added to include/asm-x86;
move it to the new arch/x86/include/asm location
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Mikael Pettersson reported:
The 2.6.28-rc kernels fail to detect PCI device 0000:00:01.0
(the first ethernet port) on my Thecus n2100 XScale box.
There is however still a strange "ghost" device that gets partially
detected in 2.6.28-rc2 vanilla.
The IOP321 manual says:
The user designates the memory region containing the OCCDR as
non-cacheable and non-bufferable from the IntelR XScaleTM core.
This guarantees that all load/stores to the OCCDR are only of
DWORD quantities.
Ensure that the OCCDR is so mapped.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Same fix as commit c7cf72dcadb: when 'start' and 'end' are less than a
cacheline apart and 'start' is unaligned we are done after cleaning and
invalidating the first cacheline.
Cc: <stable@kernel.org>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
unset CONFIG_PCI_LEGACY in the defconfigs as none of them enable
ISDN drivers which seem to be the only place we are using pci_find_device
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Fixes following build error:
CC drivers/usb/gadget/fsl_qe_udc.o
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eprx_stall_change':
drivers/usb/gadget/fsl_qe_udc.c:156: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:163: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eptx_stall_change':
drivers/usb/gadget/fsl_qe_udc.c:173: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:180: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eprx_nack':
drivers/usb/gadget/fsl_qe_udc.c:201: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:201: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eprx_normal':
drivers/usb/gadget/fsl_qe_udc.c:218: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:218: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_ep_reset':
drivers/usb/gadget/fsl_qe_udc.c:325: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:342: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_ep_register_init':
drivers/usb/gadget/fsl_qe_udc.c:515: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'ch9getstatus':
drivers/usb/gadget/fsl_qe_udc.c:1981: error: 'struct usb_ctlr' has no member named 'usb_usep'
make[2]: *** [drivers/usb/gadget/fsl_qe_udc.o] Error 1
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch corrects the bus-frequency value provided in the SBC610's dts.
Signed-off-by: Martyn Welch <martyn.welch@gefanuc.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
variant of __cycles_2_ns().
Most of the time sched_clock() is called with irqs disabled already.
The few places that call it with irqs enabled need to be updated.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
in scheduler-intense workloads native_read_tsc() overhead accounts for
20% of the system overhead:
659567 system_call 41222.9375
686796 schedule 435.7843
718382 __switch_to 665.1685
823875 switch_mm 4526.7857
1883122 native_read_tsc 55385.9412
9761990 total 2.8468
this is large part due to the rdtsc_barrier() that is done before
and after reading the TSC.
But sched_clock() is not a precise clock in the GTOD sense, using such
barriers is completely pointless. So remove the barriers and only use
them in vget_cycles().
This improves lat_ctx performance by about 5%.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
IA64 kdump kernel failed to initialize /proc/vmcore in 2.6.28-rc2.
A bug was introduced in this patch commit:
d9a9855d0b
always reserve elfcore header memory in crash kernel
The problem was that the call to reserve_elfcorehdr() should be placed
in CONFIG_CRASH_DUMP rather than in CONFIG_CRASH_KERNEL, which does
not exist.
Signed-off-by: Jay Lan <jlan@sgi.com>
Acked-by: Simon Hormon <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix the counter overflow check for CPUs with counter width > 32
I had a similar change in a different patch that I didn't submit
and I didn't notice the problem earlier because it was always
tested together.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Xen requires that all mappings of pagetable pages are read-only, so
that they can't be updated illegally. As a result, if a page is being
turned into a pagetable page, we need to make sure all its mappings
are RO.
If the page had been used for ioremap or vmalloc, it may still have
left over mappings as a result of not having been lazily unmapped.
This change makes sure we explicitly mop them all up before pinning
the page.
Unlike aliases created by kmap, the there can be vmalloc aliases even
for non-high pages, so we must do the flush unconditionally.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix 32-bit Xen guest boot crash
On 32-bit PAE, pud_page, for no good reason, didn't really return a
struct page *. Since Jan Beulich's fix "i386/PAE: fix pud_page()",
pud_page does return a struct page *.
Because PAE has 3 pagetable levels, the pud level is folded into the
pgd level, so pgd_page() is the same as pud_page(), and now returns
a struct page *. Update the xen/mmu.c code which uses pgd_page()
accordingly.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When 'start' and 'end' are less than a cacheline apart and 'start' is
unaligned we are done after cleaning and invalidating the first
cacheline. So check for (start < end) which will not walk off into
invalid address ranges when (start > end).
This issue was caught by drivers/dma/dmatest.
2.6.27 is susceptible.
Cc: <stable@kernel.org>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Lothar WaÃ<9f>mann <LW@KARO-electronics.de>
Cc: Lennert Buytenhek <buytenh@marvell.com>
Cc: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>