On ARM, debug exceptions occur in the form of data or prefetch aborts.
One difference is that debug exceptions require access to per-cpu banked
registers and data structures which are not saved in the low-level exception
code. For kernels built with CONFIG_PREEMPT, there is an unlikely scenario
that the debug handler ends up running on a different CPU from the one
that originally signalled the event, resulting in random data being read
from the wrong registers.
This patch adds a debug_entry macro to the low-level exception handling
code which checks whether the taken exception is a debug exception. If
it is, the preempt count for the faulting process is incremented. After
the debug handler has finished, the count is decremented.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The current hw_breakpoint code tries to fix up the alignment of
breakpoints so that we can make use of sparse byte-address-select
bits in the control register and give the illusion that we can
set breakpoints on unaligned addresses.
Although this works on v6 cores, v7 forbids this behaviour, instead
requiring breakpoints to be set on aligned addresses and have contiguous
byte-address-select ranges depending on the instruction set in use.
For ARM the only supported size is 4 bytes, whilst Thumb-2 also permits
2 byte breakpoints (watchpoints can be of 1, 2, 4 or 8 bytes long).
This patch simplifies the alignment fixup code so that we require
addresses to be aligned to the size of the corresponding breakpoint.
This allows us to handle the common case of breaking on a half-word
aligned Thumb-2 instruction and also allows us to set byte watchpoints
on arbitrary addresses.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARMv7 debug architecture doesn't make any guarantees about the
contents of debug control registers following a debug logic reset.
This patch ensures that we reset the control registers when a cpu
comes ONLINE (for example, with hotplug) so that when we enable
monitor mode while inserting a breakpoint we won't exhibit random
behaviour.
Signed-off-by: Will Deacon <will.deacon@arm.com>
ARMv7 architects a system for saving and restoring the debug registers
across low-power modes. At the heart of this system is a lock register
which, when set, forbids writes to the debug registers. While locked,
writes to debug registers via the co-processor interface will result
in undefined instruction traps. Linux currently doesn't make use of
this feature because we update the debug registers on context switch
anyway, however the status of the lock is IMPLEMENTATION DEFINED on
reset.
This patch ensures that the lock is cleared during boot so that we
can write to the debug registers safely.
Signed-off-by: Will Deacon <will.deacon@arm.com>
As our SMP implementation uses MESI protocols. Grouping together data
which is mostly only read together means that we avoid unnecessary
cache line bouncing when this code shares a cache line with other data.
In other words, cache lines associated with read-mostly data are
expected to spend most of their time in shared state.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For kernels built with PREEMPT_RT, critical sections protected
by standard spinlocks are preemptible. This is not acceptable
on perf as (a) we may be scheduled onto a different CPU whilst
reading/writing banked PMU registers and (b) the latency when
reading the PMU registers becomes unpredictable.
This patch upgrades the pmu_lock spinlock to a raw_spinlock
instead.
Reported-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell reported a number of warnings coming from sparse when
checking the ARM perf_event.c files:
| perf_event.c seems to also have problems too:
|
| CHECK arch/arm/kernel/perf_event.c
| arch/arm/kernel/perf_event.c:37:1: warning: symbol 'pmu_lock' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:70:1: warning: symbol 'cpu_hw_events' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:1006:1: warning: symbol 'armv6pmu_enable_event' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:1113:1: warning: symbol 'armv6pmu_stop' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:1956:6: warning: symbol 'armv7pmu_enable_event' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:3072:14: warning: incorrect type in argument 1 (different address spaces)
| arch/arm/kernel/perf_event.c:3072:14: expected void const volatile [noderef] <asn:1>*<noident>
| arch/arm/kernel/perf_event.c:3072:14: got struct frame_tail *tail
| arch/arm/kernel/perf_event.c:3074:49: warning: incorrect type in argument 2 (different address spaces)
| arch/arm/kernel/perf_event.c:3074:49: expected void const [noderef] <asn:1>*from
| arch/arm/kernel/perf_event.c:3074:49: got struct frame_tail *tail
This patch resolves these issues so we can live in silence
again.
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When kexec is used to start a crash kernel the other cores
are notified. These non-crashing cores will save their state
in the crash notes and then do nothing.
Signed-off-by: Per Fransson <per.xx.fransson@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The existing code invokes the syscall with rubbish in r7,
due to what looks like an incorrect literal load idiom.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As we've now removed the spinlock and bitmask, we have nothing left
which requires interrupts to be disabled when sending an IPI. All
current IPI-sending implementations use the GIC, which also does not
require interrupts disabled when calling gic_raise_softirq().
Remove the now unnecessary IRQ disable.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Avoid using bitmasks and locks in the percpu area for IPIs, and instead
use individual software generated interrupts to identify the reason for
the IPI. This avoids the problems of having spinlocks in the percpu
area.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows us to use smp_cross_call() to trigger a number of different
software generated interrupts, rather than combining them all on one
SGI. Recover the SGI number via do_IPI.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If a section is not marked with SHF_ALLOC, it will be discarded
by the module code. Therefore, it is not correct to register
the unwind tables.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no need to keep pointers to the ELF sections available while
the module is loaded - we only need the section pointers while we're
finding and registering the unwind tables, which can all be done during
the finalize stage of loading.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The 32-bit conditional branches in Thumb-2 have a shorter range
(+/-512K) than their ARM counterparts (+/-32MB). The linker does
not currently generate trampolines to extend the range of these
Thumb-2 conditional branches, resulting in link errors when vmlinux
is sufficiently large, e.g.:
head.o:(.text+0x464): relocation truncated to fit: R_ARM_THM_JUMP19
This patch forces the longer-range, unconditional branch encoding
by use of an explicit IT instruction. The resulting branches are
triggered on the same conditions as before.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Directives such as .long and .word do not magically cause the
assembler location counter to become aligned in gas. As a result,
using these directives in code sections can result in misaligned
data words when building a Thumb-2 kernel (CONFIG_THUMB2_KERNEL).
This is a Bad Thing, since the ABI permits the compiler to assume
that fundamental types of word size or above are word- aligned when
accessing them from C. If the data is not really word-aligned,
this can cause impaired performance and stray alignment faults in
some circumstances.
In general, the following rules should be applied when using data
word declaration directives inside code sections:
* .quad and .double:
.align 3
* .long, .word, .single, .float:
.align (or .align 2)
* .short:
No explicit alignment required, since Thumb-2
instructions are always 2 or 4 bytes in size.
immediately after an instruction.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Directives such as .long and .word do not magically cause the
assembler location counter to become aligned in gas. As a result,
using these directives in code sections can result in misaligned
data words when building a Thumb-2 kernel (CONFIG_THUMB2_KERNEL).
This is a Bad Thing, since the ABI permits the compiler to assume
that fundamental types of word size or above are word- aligned when
accessing them from C. If the data is not really word-aligned,
this can cause impaired performance and stray alignment faults in
some circumstances.
In general, the following rules should be applied when using data
word declaration directives inside code sections:
* .quad and .double:
.align 3
* .long, .word, .single, .float:
.align (or .align 2)
* .short:
No explicit alignment required, since Thumb-2
instructions are always 2 or 4 bytes in size.
immediately after an instruction.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than passing the pte value to __pte_error, pass the raw pte_t
cookie instead. Do the same for pmd and pgd functions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The perf hardware pmu got initialized at various points in the boot,
some before early_initcall() some after (notably arch_initcall).
The problem is that the NMI lockup detector is ran from early_initcall()
and expects the hardware pmu to be present.
Sanitize this by moving all architecture hardware pmu implementations to
initialize at early_initcall() and move the lockup detector to an explicit
initcall right after that.
Cc: paulus <paulus@samba.org>
Cc: davem <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290707759.2145.119.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
swp_emulate is only used on ARMv7+, and includes ARMv7+ assembly
instructions. Allow the assembler to accept ARMv7 instructions,
but leave the compiler's code generation options alone.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The ARM perf_event.c file contains all PMU backends and, as new PMUs
are introduced, will continue to grow.
This patch follows the example of x86 and splits the PMU implementations
into separate files which are then #included back into the main
file. Compile-time guards are added to each PMU file to avoid compiling
in code that is not relevant for the version of the architecture which
we are targetting.
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, perf uses the PMU ID as an index into a string table
to look up the name of a given PMU.
This patch encodes the name of a PMU directly into the arm_pmu
structure so that PMU-specific code can be factored out into
separate files.
Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In preparation for separating the PMU-specific code, this patch adds
self-contained init functions to each PMU, therefore removing any
PMU-specific knowledge from the PMU-agnostic init_hw_perf_events
function.
Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Unlike other pmu functions, armv6pmu_pmu_stop is not declared static.
This patch adds the missing keyword.
Acked-by: Jamie Iles <jamie.iles@jamieiles.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The functions for mapping PMU events (perf, cache and raw) are
common between all PMU types and differ only in the data on which
they operate.
This patch implements common definitions of these mapping functions
and changes the arm_pmu struct to hold pointers to the data which
they require. This is in anticipation of separating out the PMU-specific
code into separate files.
Acked-by: Jamie Iles <jamie.iles@jamieiles.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Tim Bird <tim.bird@am.sony.com>
[rabin@rab.in: rebase on top of latest code,
keep code in ftrace.c instead of separate file,
check for ftrace_graph_entry also]
Signed-off-by: Rabin Vincent <rabin@rab.in>
Use assembler macros to avoid copy/pasting code between the
implementations of the two variants of the mcount call.
Signed-off-by: Rabin Vincent <rabin@rab.in>
When FUNCTION_GRAPH_TRACER is enabled, place do_IRQ() and friends in the
IRQ_ENTRY section so that the irq-related features of the function graph
tracer work.
Signed-off-by: Rabin Vincent <rabin@rab.in>
armv7_pmnc_counter_has_overflowed can return uninitialised data
if an invalid counter is specified.
This patch fixes the code to return 0 in this case, which squashes
the compiler warning from GCC 4.5.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When unwinding stack frames we must take care not to unwind
areas of memory that lie outside of the known extent of the stack.
This patch fixes an incorrect calculation of the stack base where
THREAD_SIZE is added to the stack pointer after it has already
been aligned to this value. Since the ALIGN macro performs this
addition internally, we end up overshooting the base by 8k.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The SWP instruction was deprecated in the ARMv6 architecture,
superseded by the LDREX/STREX family of instructions for
load-linked/store-conditional operations. The ARMv7 multiprocessing
extensions mandate that SWP/SWPB instructions are treated as undefined
from reset, with the ability to enable them through the System Control
Register SW bit.
This patch adds the alternative solution to emulate the SWP and SWPB
instructions using LDREX/STREX sequences, and log statistics to
/proc/cpu/swp_emulation. To correctly deal with copy-on-write, it also
modifies cpu_v7_set_pte_ext to change the mappings to priviliged RO when
user RO.
Signed-off-by: Leif Lindholm <leif.lindholm@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if the hardware has a TLS register
(CPU_32v6K is defined) since writing the TLS value to the high vectors page
isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Tested-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
DBG_MAX_REG_NUM incorrectly had the number of indices in the GDB regs
array rather than the number of registers, leading to an oops when the
"rd" command is used in KDB.
Cc: stable@kernel.org
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
use new 'datap' variable in order to remove unnecessary castings.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix up the arguments to arch_ptrace() to take account of the fact that
@addr and @data are now unsigned long rather than long as of a preceding
patch in this series.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 5085f3ff45 added better support for
CONFIG_HOTPLUG_CPU by keeping proc_info around. However, depending on
the Kconfig options selected, this can make the booting fail mysteriously
early on.
Turns out a data abort can happen in __lookup_processor in ldmia r5 {r3, r4}.
When it happens the address loaded to r5 is not aligned. Fix the problem by
aligning proc_info.
Reported-by: Anand Gadiyar <gadiyar@ti.com>
Tested-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
kexec does not disable the outer cache before disabling the inner
caches in cpu_proc_fin(). So L2 is enabled across the kexec jump. When
the new kernel enables chaches again, it randomly crashes.
Disabling L2 before calling cpu_proc_fin() cures the problem.
Disabling L2 requires the following new functions: flush_all(),
inv_all() and disable(). Add them to outer_cache_fns and call them
from the kexec code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Since we're now using addruart to establish the debug mapping, we can
remove the io_pg_offst and phys_io members of struct machine_desc.
The various declarations were removed using the following script:
grep -rl MACHINE_START arch/arm | xargs \
sed -i '/MACHINE_START/,/MACHINE_END/ { /\.\(phys_io\|io_pg_offst\)/d }'
[ Initial patch was from Jeremy Kerr, example script from Russell King ]
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Eric Miao <eric.miao at canonical.com>
Since we can get both physical and virtual addresses from the addruart
macro, we can use this to establish the debug mappings.
In the case of CONFIG_DEBUG_ICEDCC, we don't need any mappings, but
may still need to setup r7 correctly.
Incorporating ASM changes from Nicolas Pitre <npitre@fluxnic.net>.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Tested-by: Kevin Hilman <khilman@deeprootsystems.com>
Rather than checking the MMU status in every instance of addruart, do it
once in kernel/debug.S, and change the existing addruart macros to
return both physical and virtual addresses. The main debug code can then
select the appropriate address to use.
This will also allow us to retreive the address of a uart for the MMU
state that we're not current in.
Updated with fixes for OMAP from Jason Wang <jason77.wang@gmail.com>
and Tony Lindgren <tony@atomide.com>, and fix for versatile express from
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Jason Wang <jason77.wang@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Kevin Hilman <khilman@deeprootsystems.com>
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.
Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.
The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.
Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The core code now initializes the requested number of interrupts and
sets the flags in irq_desc.status which are requested by the
architecture via ARCH_IRQ_INIT_FLAGS.
Add ARCH_IRQ_INIT_FLAGS and remove the loop which sets those flags
after the irq descriptors are allocated.
[ This patch should have been in the original irq rework and got
dropped accidentaly ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Anand Gadiyar <gadiyar@ti.com>
Commit b683de2b3 in linux-next as of 20101014 (genirq: Query
arch for number of early descriptors) seems to have broken
bootup on several ARM boards - my beagleboard gives the
following dump with earlyprintk:
NR_IRQS:402
Unable to handle kernel NULL pointer dereference at virtual
address 00000028 pgd = c0004000
[00000028] *pgd=00000000
Internal error: Oops: 5 [#1]
last sysfs file:
Modules linked in:
CPU: 0 Not tainted
(2.6.36-rc7-next-20101014-linux-next-20101012+ #40) PC is at
init_IRQ+0x14/0x48 LR is at start_kernel+0x150/0x2c0
[...]
We seem to be using desc->status without assigning desc to
anything. Fix this by adding back the code that was originally
there.
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Tested-by: Linus Walleij <linus.walleij@stericsson.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
LKML-Reference: <1287077397-21781-1-git-send-email-gadiyar@ti.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.
The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.
New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time. Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.
The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.
Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.
Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.
===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
// but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}
@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}
@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}
@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}
@ fops0 @
identifier fops;
@@
struct file_operations fops = {
...
};
@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
.llseek = llseek_f,
...
};
@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
.read = read_f,
...
};
@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
.write = write_f,
...
};
@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
.open = open_f,
...
};
// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
... .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};
@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
... .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};
// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
... .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};
// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};
// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};
@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+ .llseek = default_llseek, /* write accesses f_pos */
};
// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////
@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
.write = write_f,
.read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};
@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};
@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};
@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Fix this linux-next build failure that Stephen reported:
arch/arm/kernel/perf_event.c: In function 'armpmu_event_init':
arch/arm/kernel/perf_event.c:543: error: request for member 'num_events' in something not a structure or union
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
LKML-Reference: <20101014164925.4fa16b75.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>