Add missing ATA_* defines to <linux/ata.h>. Also add
ATAPI_{LFS,EOM,ILI,IO,CODE} defines while at it.
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add missing ATA_CMD_* defines to <linux/ata.h>. Also add
ATA_EXABYTE_ENABLE_NEST, SETFEATURES_AAM_* and ATA_SMART_*
defines while at it.
Partially based on earlier work by Chris Wedgwood.
Acked-by: Chris Wedgwood <cw@f00f.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add missing ATA_ID_* defines and update {ata,atapi}_*()
inlines accordingly. The currently unused defines are
needed for the forthcoming drivers/ide/ changes.
v2:
Add ATA_ID_SPG.
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Quoting Mike Travis in "x86: cleanup early per cpu variables/accesses v4"
(23ca4bba3e):
The DEFINE macro defines the per_cpu variable as well as the early
map and pointer. It also initializes the per_cpu variable and map
elements to "_initvalue". The early_* macros provide access to
the initial map (usually setup during system init) and the early
pointer. This pointer is initialized to point to the early map
but is then NULL'ed when the actual per_cpu areas are setup. After
that the per_cpu variable is the correct access to the variable.
As these variables are NULL'ed before __init sections are dropped
(in setup_per_cpu_maps), they can be safely annotated as __ref.
This change silences following section mismatch warnings:
WARNING: vmlinux.o(.data+0x46c0): Section mismatch in reference from the variable x86_cpu_to_apicid_early_ptr to the variable .init.data:x86_cpu_to_apicid_early_map
The variable x86_cpu_to_apicid_early_ptr references
the variable __initdata x86_cpu_to_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
WARNING: vmlinux.o(.data+0x46c8): Section mismatch in reference from the variable x86_bios_cpu_apicid_early_ptr to the variable .init.data:x86_bios_cpu_apicid_early_map
The variable x86_bios_cpu_apicid_early_ptr references
the variable __initdata x86_bios_cpu_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
WARNING: vmlinux.o(.data+0x46d0): Section mismatch in reference from the variable x86_cpu_to_node_map_early_ptr to the variable .init.data:x86_cpu_to_node_map_early_map
The variable x86_cpu_to_node_map_early_ptr references
the variable __initdata x86_cpu_to_node_map_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi()
The function __cpuinit init_amd() references
a function __init check_enable_amd_mmconf_dmi().
If check_enable_amd_mmconf_dmi is only used by init_amd then
annotate check_enable_amd_mmconf_dmi with a matching annotation.
check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 add/sub atomic ops does not seems to accept integer values bigger
than 32 bits as immediates. Intel's add/sub documentation specifies they
have to be passed as registers.
The only operations in the x86-64 architecture which accept arbitrary
64-bit immediates is "movq" to any register; similarly, the only
operation which accept arbitrary 64-bit displacement is "movabs" to or
from al/ax/eax/rax.
http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Machine-Constraints.html
states :
e
32-bit signed integer constant, or a symbolic reference known to fit
that range (for immediate operands in sign-extending x86-64
instructions).
Z
32-bit unsigned integer constant, or a symbolic reference known to
fit that range (for immediate operands in zero-extending x86-64
instructions).
Since add/sub does sign extension, using the "e" constraint seems appropriate.
It applies to 2.6.27-rc, 2.6.26, 2.6.25...
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
security.h: fix build failure
include/linux/security.h: In function 'security_ptrace_traceme':
include/linux/security.h:1760: error: 'parent' undeclared (first use in this function)
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
The exported copy of videodev2.h contains this line:
#define #include <sys/time.h>
This is because for some reason it defines __user for itself -- despite
the fact that we remove all instances of __user when exporting headers.
_All_ pointers in userspace are user pointers. Fix it by removing the
unnecessary '#define __user' from the file.
The new headers ivtv.h and ivtvfb.h would have the same problem... if
whoever put them there had actually remembered to add them to the Kbuild
file while he was at it. Fix those too, and export them as was
presumably intended.
Note that includes of <linux/compiler.h> are also stripped by the header
export process, so those don't need to be conditional.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Try to comment away a little of the confusion between mm's vm_area_struct
vm_flags and vmalloc's vm_struct flags: based on an idea by Ulrich Drepper.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes CVS keywords that weren't updated for a long time.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
as per this discussion:
http://lkml.org/lkml/2008/8/12/423
Pardo reported that 64-bit threaded apps, if their stacks exceed the
combined size of ~4GB, slow down drastically in pthread_create() - because
glibc uses MAP_32BIT to allocate the stacks. The use of MAP_32BIT is
a legacy hack - to speed up context switching on certain early model
64-bit P4 CPUs.
So introduce a new flag to be used by glibc instead, to not constrain
64-bit apps like this.
glibc can switch to this new flag straight away - it will be ignored
by the kernel. If those old CPUs ever matter to anyone, support for
it can be implemented.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ulrich Drepper <drepper@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This moves it to being a tty operation. That removes special cases and now
also means that resize can be picked up by um and other non vt consoles
which may have a resize operation.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
as per this discussion:
http://lkml.org/lkml/2008/8/12/423
Pardo reported that 64-bit threaded apps, if their stacks exceed the
combined size of ~4GB, slow down drastically in pthread_create() - because
glibc uses MAP_32BIT to allocate the stacks. The use of MAP_32BIT is
a legacy hack - to speed up context switching on certain early model
64-bit P4 CPUs.
So introduce a new flag to be used by glibc instead, to not constrain
64-bit apps like this.
glibc can switch to this new flag straight away - it will be ignored
by the kernel. If those old CPUs ever matter to anyone, support for
it can be implemented.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ulrich Drepper <drepper@gmail.com>
Sam Ravnborg did the build-test that the direct header file move works,
I'm just committing it.
This is a pure move:
mkdir arch/alpha/include
git mv include/asm-alpha arch/alpha/include/asm
with no other changes.
Requested-and-tested-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
m68k fails to build with these functions inlined in completion.h. Move
them out of line into sched.c and export them to avoid this problem.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add __ftrace_enabled_save/restore, used to disable ftrace for a while.
Now, this is used by kexec jump, which need a version without lock, for
general situation, a locked version should be used.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kexec/Kexec-jump require code size in control page is less than
PAGE_SIZE/2. This patch add link-time checking for this.
ASSERT() of ld link script is used as the link-time checking mechanism.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE, because control
page is used for not only code on some platform. For example in kexec
jump, it is used for data and stack too.
[akpm@linux-foundation.org: unbreak powerpc and arm, finish conversion]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WARNING: vmlinux.o(.text+0xe684): Section mismatch in reference from the function register_nosave_region() to the function .init.text:__register_nosave_region()
The function register_nosave_region() references
the function __init __register_nosave_region().
This is often because register_nosave_region lacks a __init
annotation or the annotation of __register_nosave_region is wrong.
register_nosave_region calls __init function and is called only from
__init functions
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Plus add a build time check so this doesn't go unnoticed again.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adds a simple IRQ autodetection to the AMD Geode MFGPT driver, and more
importantly, adds some checks, if IRQs can actually be received on the
chosen line. This fixes cases where MFGPT is selected as clocksource
though not producing any ticks, so the kernel simply starves during
boot.
Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de>
Cc: Andres Salomon <dilinger@debian.org>
Cc: linux-geode@bombadil.infradead.org
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The masked difference is what needs to be compared against 1, rather
than the difference of masked values (which can be negative).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Building 2.6.27-rc1 on x86 with gcc-3.2.3 fails with:
In file included from include/asm/dma.h:12,
from include/linux/bootmem.h:8,
from init/main.c:26:
include/asm/io.h: In function `readb':
include/asm/io.h:32: syntax error before string constant
include/asm/io.h: In function `readw':
include/asm/io.h:33: syntax error before string constant
include/asm/io.h: In function `readl':
include/asm/io.h:34: syntax error before string constant
include/asm/io.h: In function `__readb':
include/asm/io.h:36: syntax error before string constant
include/asm/io.h: In function `__readw':
include/asm/io.h:37: syntax error before string constant
include/asm/io.h: In function `__readl':
include/asm/io.h:38: syntax error before string constant
make[1]: *** [init/main.o] Error 1
make: *** [init] Error 2
Starting with 2.6.27-rc1 readb() et al are generated by a
build_mmio_read() macro, which generates asm() statements with
output register constraints like "=" "q", i.e. as two adjacent
string literals. This doesn't work with gcc-3.2.3.
Fixed by moving the "=" part into the callers' reg parameter
(as suggested by Ingo).
Build and boot-tested with gcc-3.2.3 on 32 and 64-bit x86.
Fixes <http://bugzilla.kernel.org/show_bug.cgi?id=11205>.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache. On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.
Relocate the cache flush to be the very last thing done before
halting. Tie into an assembly line so the compile will not
reorder it. Add some documentation explaining what is going
on and why we're doing this.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Mark Borden <mark.borden@amd.com>
Acked-by: Michael Hohmuth <michael.hohmuth@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds device table initializations which forbids memory accesses
for devices per default and disables all page faults.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add error check after all calls to acpi_ns_get_pathname_length.
Add status return from acpi_ns_build_external_path and check after
all calls. Add parameter validation to acpi_ut_initialize_buffer.
Reported by and initial patch by Ingo Molnar.
http://lkml.org/lkml/2008/7/21/176
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.
__capable() is using neither atomic ops nor locking to protect t->flags. This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.
This patch further splits security_ptrace() in two:
(1) security_ptrace_may_access(). This passes judgement on whether one
process may access another only (PTRACE_MODE_ATTACH for ptrace() and
PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
current is the parent.
(2) security_ptrace_traceme(). This passes judgement on PTRACE_TRACEME only,
and takes only a pointer to the parent process. current is the child.
In Smack and commoncap, this uses has_capability() to determine whether
the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
This does not set PF_SUPERPRIV.
Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().
Of the places that were using __capable():
(1) The OOM killer calls __capable() thrice when weighing the killability of a
process. All of these now use has_capability().
(2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
whether the parent was allowed to trace any process. As mentioned above,
these have been split. For PTRACE_ATTACH and /proc, capable() is now
used, and for PTRACE_TRACEME, has_capability() is used.
(3) cap_safe_nice() only ever saw current, so now uses capable().
(4) smack_setprocattr() rejected accesses to tasks other than current just
after calling __capable(), so the order of these two tests have been
switched and capable() is used instead.
(5) In smack_file_send_sigiotask(), we need to allow privileged processes to
receive SIGIO on files they're manipulating.
(6) In smack_task_wait(), we let a process wait for a privileged process,
whether or not the process doing the waiting is privileged.
I've tested this with the LTP SELinux and syscalls testscripts.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
- remove cheesy read_iloc() function
- move invalidate_entire_icache function to lock.S
- export proper prototypes for functions in lock.S
- only build lock.S when BFIN_ICACHE_LOCK is enabled
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Use platform_data to pass musb configuration-specific
details to musb driver.
This patch will prevent that other platforms selecting
HAVE_CLK and enabling musb won't break tree building.
The other parts of it will come when linux-omap merge
up more omap2/3 board-files.
Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds support for MUSB and TUSB controllers
integrated into omap2430 and davinci. It also adds support
for external tusb6010 controller.
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1121) fixes a bug in the USB serial core. When a device
is unregistered, the core will give back its minors -- even if the
device hasn't been assigned any!
The patch reserves the highest minor value (255) to mean that no minor
was assigned. It also removes some dead code and does a small style
fixup.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1117) adds a kerneldoc line for the "needs_binding"
field in struct usb_interface. It was accidentally omitted when the
field was added.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The patches that are intended to introduce copy-on-write credentials for 2.6.28
require abstraction of access to some fields of the task structure,
particularly for the case of one task accessing another's credentials where RCU
will have to be observed.
Introduced here are trivial no-op versions of the desired accessors for current
and other tasks so that other subsystems can start to be converted over more
easily.
Wrappers are introduced into a new header (linux/cred.h) for UID/GID,
EUID/EGID, SUID/SGID, FSUID/FSGID, cap_effective and current's subscribed
user_struct. These wrappers are macros because the ordering between header
files mitigates against making them inline functions.
linux/cred.h is #included from linux/sched.h.
Further, XFS is modified such that it no longer defines and uses parameterised
versions of current_fs[ug]id(), thus getting rid of the namespace collision
otherwise incurred.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Done as a script (well, a single "git mv" actually) on request from
Yoshinori Sato as a way to avoid a huge diff.
Requested-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfgang Walter reported this oops on his via C3 using padlock for
AES-encryption:
##################################################################
BUG: unable to handle kernel NULL pointer dereference at 000001f0
IP: [<c01028c5>] __switch_to+0x30/0x117
*pde = 00000000
Oops: 0002 [#1] PREEMPT
Modules linked in:
Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
EIP is at __switch_to+0x30/0x117
EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
Call Trace:
[<c03b5b43>] ? schedule+0x285/0x2ff
[<c0131856>] ? pm_qos_requirement+0x3c/0x53
[<c0239f54>] ? acpi_processor_idle+0x0/0x434
[<c01025fe>] ? cpu_idle+0x73/0x7f
[<c03a4dcd>] ? rest_init+0x61/0x63
=======================
Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
around the padlock instructions fix the oops.
Suresh wrote:
These padlock instructions though don't use/touch SSE registers, but it behaves
similar to other SSE instructions. For example, it might cause DNA faults
when cr0.ts is set. While this is a spurious DNA trap, it might cause
oops with the recent fpu code changes.
This is the code sequence that is probably causing this problem:
a) new app is getting exec'd and it is somewhere in between
start_thread() and flush_old_exec() in the load_xyz_binary()
b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
cleared.
c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
routines in the network stack. This generates a math fault (as
cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
in the task's xstate.
d) Return to exec code path, which does start_thread() which does
free_thread_xstate() and sets xstate pointer to NULL while
the TS_USEDFPU is still set.
e) At the next context switch from the new exec'd task to another task,
we have a scenarios where TS_USEDFPU is set but xstate pointer is null.
This can cause an oops during unlazy_fpu() in __switch_to()
Now:
1) This should happen with or with out pre-emption. Viro also encountered
similar problem with out CONFIG_PREEMPT.
2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
kernel_fpu_begin() will manually do a clts() and won't run in to the
situation of setting TS_USEDFPU in step "c" above.
3) This was working before the fpu changes, because its a spurious
math fault which doesn't corrupt any fpu/sse registers and the task's
math state was always in an allocated state.
With out the recent lazy fpu allocation changes, while we don't see oops,
there is a possible race still present in older kernels(for example,
while kernel is using kernel_fpu_begin() in some optimized clear/copy
page and an interrupt/softirq happens which uses these padlock
instructions generating DNA fault).
This is the failing scenario that existed even before the lazy fpu allocation
changes:
0. CPU's TS flag is set
1. kernel using FPU in some optimized copy routine and while doing
kernel_fpu_begin() takes an interrupt just before doing clts()
2. Takes an interrupt and ipsec uses padlock instruction. And we
take a DNA fault as TS flag is still set.
3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts
4. We complete the padlock routine
5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
the optimized copy routine and does kernel_fpu_end(). At this point,
we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll
set and not cleared.
6. Now kernel resumes its user operation. And at the next context
switch, kernel sees it has do a FP save as TS_USEDFPU is still set
and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu()
will take a DNA fault, as cr0.ts is '1' and now, because we are
in __switch_to(), math_state_restore() will get confused and will
restore the next task's FP state and will save it in prev tasks's FP state.
Remember, in __switch_to() we are already on the stack of the next task
but take a DNA fault for the prev task.
This causes the fpu leakage.
Fix the padlock instruction usage by calling them inside the
context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
manually in the interrupt context. This will not generate spurious DNA
in the context of the interrupt which will fix the oops encountered and
the possible FPU leakage issue.
Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 74768ed833 "page allocator: use no-panic variant of
alloc_bootmem() in alloc_large_system_hash()" introduced two new
_nopanic macros which are undefined for CONFIG_HAVE_ARCH_BOOTMEM_NODE.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: "Jan Beulich" <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch removes ip6_prohibit_entry and ip6_blk_hole_entry
declarations from include/net/ip6_route.h as they are unused.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes rt6_lock declaration from include/net/ip6_route.h
as it is unused.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based upon a bug report by Andrew Gallatin on netdev
with subject "CPU utilization increased in 2.6.27rc"
In commit 37437bb2e1
("pkt_sched: Schedule qdiscs instead of netdev_queue.")
the test of the queue being stopped was erroneously
removed from qdisc_run().
When the TX queue of the device fills up, this omission
causes lots of extraneous useless work to be queued up
to softirq context, where we'll just return immediately
because the device is still stuffed up.
Signed-off-by: David S. Miller <davem@davemloft.net>
XFS object flushing doesn't quite match existing completion semantics. It
mixed exclusive access with completion. That is, we need to mark an object as
being flushed before flushing it to disk, and then block any other attempt to
flush it until the completion occurs. We do this but adding an extra count to
the completion before we start using them. However, we still need to
determine if there is a completion in progress, and allow no-blocking attempts
fo completions to decrement the count.
To do this we introduce:
int try_wait_for_completion(struct completion *x)
returns a failure status if done == 0, otherwise decrements done
to zero and returns a "started" status. This is provided
to allow counted completions to begin safely while holding
object locks in inverted order.
int completion_done(struct completion *x)
returns 1 if there is no waiter, 0 if there is a waiter
(i.e. a completion in progress).
This replaces the use of semaphores for providing this exclusion
and completion mechanism.
SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31816a
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Various cleanup the drivers/firmware/memmap (after review by AKPM):
- fix kdoc to conform to the standard
- move kdoc from header to implementation files
- remove superfluous WARN_ON() after kmalloc()
- WARN_ON(x); if (!x) -> if(!WARN_ON(x))
- improve some comments
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>