of_device_ids (i.e. compatible strings and the respective data) are not
supposed to change at runtime. All functions working with of_device_ids
provided by <linux/of.h> work with const of_device_ids. So mark the
non-const structs in arch/arm as const, too.
While at it also add some __initconst annotations.
Acked-by: Jason Cooper <jason@lakedameon.net>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
For instrumenting global variables KASan will shadow memory backing memory
for modules. So on module loading we will need to allocate memory for
shadow and map it at address in shadow that corresponds to the address
allocated in module_alloc().
__vmalloc_node_range() could be used for this purpose, except it puts a
guard hole after allocated area. Guard hole in shadow memory should be a
problem because at some future point we might need to have a shadow memory
at address occupied by guard hole. So we could fail to allocate shadow
for module_alloc().
Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into
__vmalloc_node_range(). Add new parameter 'vm_flags' to
__vmalloc_node_range() function.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an attacker can cause a controlled kernel stack overflow, overwriting
the restart block is a very juicy exploit target. This is because the
restart_block is held in the same memory allocation as the kernel stack.
Moving the restart block to struct task_struct prevents this exploit by
making the restart_block harder to locate.
Note that there are other fields in thread_info that are also easy
targets, at least on some architectures.
It's also a decent simplification, since the restart code is more or less
identical on all architectures.
[james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some platforms always enter the kernel in ARM mode even if the
kernel is compiled for THUMB2. Add a small wrapper on top of
secondary_startup() that switches into THUMB2 mode.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use common resource list management data structure and interfaces
instead of private implementation.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Minimal builds for v7M are broken when printk is disabled. The caller is
assembly so add the necessary ifdef around the call.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This line converts PHYS_OFFSET into PHYS_PFN_OFFSET.
It is better to use PAGE_SHIFT rather than the magic number 12.
Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As of commit abda1bd5f4 __cpu_suspend()
takes only 2 arguments, and those arguments are passed by the platform
code. This comment thus makes no sense, as cpu_suspend() is not actually
hiding any arguments.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current rounding of "size" is wrong:
- If "start" is sufficiently near the next page boundary, "size"
is decremented by more than enough and the last page is lost.
- If "size" is sufficiently small, it is wrapped around and gets
a bogus value.
Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Certain platforms (i.e. Exynos) might need to set .write_sec callback
from firmware initialization which is happenning in .init_early callback
of machine descriptor. However current code will overwrite the pointer
with whatever is present in machine descriptor, even though it can be
already set earlier. This patch fixes this by making the assignment
conditional, depending on whether current .write_sec callback is NULL.
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Nishanth Menon <nm@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch introduce kprobeopt for ARM 32.
Limitations:
- Currently only kernel compiled with ARM ISA is supported.
- Offset between probe point and optinsn slot must not larger than
32MiB. Masami Hiramatsu suggests replacing 2 words, it will make
things complex. Futher patch can make such optimization.
Kprobe opt on ARM is relatively simpler than kprobe opt on x86 because
ARM instruction is always 4 bytes aligned and 4 bytes long. This patch
replace probed instruction by a 'b', branch to trampoline code and then
calls optimized_callback(). optimized_callback() calls opt_pre_handler()
to execute kprobe handler. It also emulate/simulate replaced instruction.
When unregistering kprobe, the deferred manner of unoptimizer may leave
branch instruction before optimizer is called. Different from x86_64,
which only copy the probed insn after optprobe_template_end and
reexecute them, this patch call singlestep to emulate/simulate the insn
directly. Futher patch can optimize this behavior.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jon Medhurst (Tixy) <tixy@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
This patch prohibits probing instructions for which the stack
requirements are unable to be determined statically. Some test cases
are found not work again after the modification, this patch also
removes them.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
If the overflow threshold for a counter is set above or near the
0xffffffff boundary then the kernel may lose track of the overflow
causing only events that occur *after* the overflow to be recorded.
Specifically the problem occurs when the value of the performance counter
overtakes its original programmed value due to wrap around.
Typical solutions to this problem are either to avoid programming in
values likely to be overtaken or to treat the overflow bit as the 33rd
bit of the counter.
Its somewhat fiddly to refactor the code to correctly handle the 33rd bit
during irqsave sections (context switches for example) so instead we take
the simpler approach of avoiding values likely to be overtaken.
We set the limit to half of max_period because this matches the limit
imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
rate for large threshold values, however even with a very fast counter
ticking at 4GHz the interrupt rate would only be ~1Hz.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Currently restore_user_regs deallocates the SVC stack early in
its execution and relies on no exception being taken between
the deallocation and the registers being restored. The introduction
of a default FIQ handler that also uses the SVC stack breaks this
assumption and can result in corrupted register state.
This patch works around the problem by removing the early
stack deallocation and using r2 as a temporary instead. I have
not found a way to do this without introducing an extra mov
instruction to the macro.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On x86_64, at least, task_pt_regs may be only partially initialized
in many contexts, so x86_64 should not use it without extra care
from interrupt context, let alone NMI context.
This will allow x86_64 to override the logic and will supply some
scratch space to use to make a cleaner copy of user regs.
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: chenggang.qcg@taobao.com
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/e431cd4c18c2e1c44c774f10758527fb2d1025c4.1420396372.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Most if not all ARM PCI host controller device drivers either ignore the
domain field in the pci_sys_data structure or just increment it every time
a host controller is probed, using it as a domain counter.
Therefore, instead of relying on pci_sys_data to stash the domain number in
a standard location, ARM pcibios code can be moved to the newly introduced
generic PCI domains code, implemented in commits:
41e5c0f81d ("of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()")
670ba0c888 ("PCI: Add generic domain handling")
ARM code is made to select PCI_DOMAINS_GENERIC by default, which builds
core PCI code that assigns the domain number through the generic function:
void pci_bus_assign_domain_nr(...)
that relies on a DT property to define the domain number or falls back to a
counter according to a predefined logic; its usage replaces the current
domain assignment code in PCI host controllers present in the kernel.
Tested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> # mvebu
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Phil Edworthy <phil.edworthy@renesas.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
CC: Mohit Kumar <mohit.kumar@st.com>
After commit b2b49ccbdd (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks
depending on CONFIG_PM_RUNTIME may now be changed to depend on
CONFIG_PM.
Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere in the code under
arch/arm/ (the defconfig files will be modified later).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Nishanth Menon <nm@ti.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Two files that get included when building the multi_v7_defconfig target
fail to build when selecting THUMB2_KERNEL for this configuration.
In both cases, we can just build the file as ARM code, as none of its
symbols are exported to modules, so there are no interworking concerns.
In the iwmmxt.S case, add ENDPROC() declarations so the symbols are
annotated as functions, resulting in the linker to emit the appropriate
mode switches.
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If the kernel is running in hypervisor mode or monitor mode we'll
print UK6_32 or UK10_32 if we call into __show_regs(). Let's
update these strings to indicate the new modes that didn't exist
when this code was written.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We need to ensure that the IOMMUs in the system have a chance to perform
some basic initialisation before we start adding masters to them.
This patch adds a call to of_iommu_init before of_platform_populate.
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We cannot restart cacheflush safely if a process provides user-defined
signal handler and signal is pending. In this case -EINTR is returned
and it is expected that process re-invokes syscall. However, there are
a few problems with that:
* looks like nobody bothers checking return value from cacheflush
* but if it did, we don't provide the restart address for that, so the
process has to use the same range again
* ...and again, what might lead to looping forever
So, remove cacheflush restarting code and terminate cache flushing
as early as fatal signal is pending.
Cc: stable@vger.kernel.org # 3.12+
Reported-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There are no users of the struct hw_pci.add_bus() or .remove_bus() methods,
so remove the pointers from hw_pci. That makes pcibios_add_bus() and
pcibios_remove_bus() themselves superfluous, so remove them as well.
[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently ARM associates an MSI controller with a PCI bus by defining
pcibios_add_bus() and using it to call a struct hw_pci.add_bus() method.
That method sets the struct pci_bus "msi" member. That's unwieldy and
unnecessarily couples MSI with the PCI enumeration code.
On ARM, all devices under the same PCI host bridge share an MSI controller,
so add an msi_controller pointer to the struct pci_sys_data and implement
pcibios_msi_controller() to retrieve it.
This is a step toward moving the msi_controller pointer into the generic
struct pci_host_bridge.
[bhelgaas: changelog, take pci_dev instead of pci_bus]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The ftrace assembly code doesn't need to live in entry-common.S and
be surrounded with #ifdef CONFIG_FUNCTION_TRACER. Instead, move it
to its own file and conditionally assemble it.
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The generic dump_stack() code provides the facility to include the
machine name in the stack dump, which can be useful information. Add
a call to dump_stack_set_arch_desc() for the generic code to print
this information.
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The "SMP: Total of %d processors activated." message which we print in
smp_cpus_done() provides no further information than the message in
genreic code in smp_announce(). Kill it.
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Drop the "CPUn: Booted secondary processor" message from info to debug
level. We later print how many CPUs came online, so listing each one
is redundant, and when using hotplug, can be quite noisy.
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than open coding the printk_ratelimit() check with pr_warn(), use
pr_warn_ratelimited() instead.
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Convert many (but not all) printk(KERN_* to pr_* to simplify the code.
We take the opportunity to join some printk lines together so we don't
split the message across several lines, and we also add a few levels
to some messages which were previously missing them.
Tested-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Kernel-side struct msghdr is (currently) using the same layout as
userland one, but it's not a one-to-one copy - even without considering
32bit compat issues, we have msg_iov, msg_name and msg_control copied
to kernel[1]. It's fairly localized, so we get away with a few functions
where that knowledge is needed (and we could shrink that set even
more). Pretty much everything deals with the kernel-side variant and
the few places that want userland one just use a bunch of force-casts
to paper over the differences.
The thing is, kernel-side definition of struct msghdr is *not* exposed
in include/uapi - libc doesn't see it, etc. So we can add struct user_msghdr,
with proper annotations and let the few places that ever deal with those
beasts use it for userland pointers. Saner typechecking aside, that will
allow to change the layout of kernel-side msghdr - e.g. replace
msg_iov/msg_iovlen there with struct iov_iter, getting rid of the need
to modify the iovec as we copy data to/from it, etc.
We could introduce kernel_msghdr instead, but that would create much more
noise - the absolute majority of the instances would need to have the
type switched to kernel_msghdr and definition of struct msghdr in
include/linux/socket.h is not going to be seen by userland anyway.
This commit just introduces user_msghdr and switches the few places that
are dealing with userland-side msghdr to it.
[1] actually, it's even trickier than that - we copy msg_control for
sendmsg, but keep the userland address on recvmsg.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch remove clear_thread_flag(TIF_UPROBE) in do_work_pending(),
because uprobe_notify_resume() have do this.
Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use the global current_stack_pointer to get the value of the stack pointer.
This change supports being able to compile the kernel with both gcc and clang.
Signed-off-by: Behan Webster <behanw@converseincode.com>
Reviewed-by: Mark Charlebois <charlebm@gmail.com>
Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use the global current_stack_pointer to get the value of the stack pointer.
This change supports being able to compile the kernel with both gcc and clang.
Signed-off-by: Behan Webster <behanw@converseincode.com>
Reviewed-by: Mark Charlebois <charlebm@gmail.com>
Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use the global current_stack_pointer to get the value of the stack pointer.
This change supports being able to compile the kernel with both gcc and Clang.
Signed-off-by: Behan Webster <behanw@converseincode.com>
Reviewed-by: Mark Charlebois <charlebm@gmail.com>
Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Removing minimal support for etb/etm to favour an implementation
that is more flexible, extensible and capable of handling more
platforms.
Also removing the only client of the old driver. That code can
easily be replaced by entries for etb/etm in the device tree.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Handling multiple PMUs using a single hotplug notifier requires a list
of PMUs to be maintained, with synchronisation in the probe, remove, and
notify paths. This is error-prone and makes the code much harder to
maintain.
Instead of using a single notifier, we can dynamically allocate a
notifier block per-PMU. The end result is the same, but the list of PMUs
is implicit in the hotplug notifier list rather than within a perf-local
data structure, which makes the code far easier to handle.
Signed-off-by: Mark Rutland <mark.rutland at arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
To support multiple PMUs, each PMU will need its own accounting data.
As we don't know how (in general) many PMUs we'll have to support at
compile-time, we must allocate the data at runtime dynamically
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently the percpu_pmu pointers used as percpu_irq dev_id values are
defined separately from the other per-cpu accounting data, which make
dynamically allocating the data (as will be required for systems with
heterogeneous CPUs) difficult.
This patch moves the percpu_pmu pointers into pmu_hw_events (which is
itself allocated per cpu), which will allow for easier dynamic
allocation. Both percpu and regular irqs are requested using percpu_pmu
pointers as tokens, freeing us from having to know whether an irq is
percpu within the handler, and thus avoiding a radix tree lookup on the
handler path.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that the arm pmu code is limited to CPU PMUs the get_hw_events()
function is superfluous, as we'll always have a set of per-cpu
pmu_hw_events structures.
This patch removes the get_hw_events() function, replacing it with
a percpu hw_events pointer. Uses of get_hw_events are updated to use
this_cpu_ptr.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 3fc2c83087 (ARM: perf: remove event limit from pmu_hw_events) got
rid of the upper limit on the number of events an arm_pmu could handle,
but introduced additional complexity and places a burden on each PMU
driver to allocate accounting data somehow. So far this has not
generally been useful as the only users of arm_pmu are the CPU backend
and the CCI driver.
Now that the CCI driver plugs into the perf subsystem directly, we can
remove some of the complexities that get in the way of supporting
heterogeneous CPU PMUs.
This patch restores the original limits on pmu_hw_events fields such
that the pmu_hw_events data can be allocated as a contiguous block. This
will simplify dynamic pmu_hw_events allocation in later patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
For systems with heterogeneous CPUs (e.g. big.LITTLE systems) the PMUs
can be different in each cluster, and not all events can be migrated
between clusters. To allow userspace to deal with this, it must be
possible to address each PMU independently.
This patch changes PMUs to be registered with dynamic (IDR) types,
allowing them to be targeted individually. Each PMU's type can be found
in ${SYSFS_ROOT}/bus/event_source/devices/${PMU_NAME}/type.
From userspace, raw events can be targeted at a specific PMU:
$ perf stat -e ${PMU_NAME}/config=V,config1=V1,.../
Doing this does not break existing tools which use existing perf types:
when perf core can't find a PMU of matching type (in perf_init_event)
it'll iterate over the set of all PMUs. If a compatible PMU exists,
it'll be found eventually. If more than one compatible PMU exists, the
event will be handled by whichever PMU happens to be earlier in the pmus
list (which currently will be the last compatible PMU registered).
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The current PMU probing logic consists of a single switch statement,
which means that the core arm_pmu core in perf_event_cpu.c needs to know
about every CPU PMU variant supported by a driver using the arm_pmu
framework. This makes it rather difficult to decouple the drivers from
the (otherwise generic) probing code.
The patch refactors that switch statement to a table-driven lookup,
separating the logic and knowledge (in the form of the table). Later
patches will split the table across the relevant PMU drivers, which can
pass their tables to the generic probing function.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Most of the pr_info format strings in perf_event_cpu.c are missing
newlines. Currently we get away with this as the format strings for
subsequent calls to printk (including all pr_* calls) begin with a log
prefix, and the printk core adds the omitted newline for this case.
While generates the output we expect, we probably should not rely on the
format of successive printk calls in order to get legible output.
This patch adds the missing newlines to pr_info format strings in
perf_event_cpu.c, making them consistent with the format strings for
other pr_info, warn, and pr_err calls, and preventing potentially
illegible output if the next printk/pr_* format string doesn't begin
with a log prefix.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM callchain handling code is currently bundled with the ARM PMU
management code, despite the two having no dependency on each other.
This bundling has the unfortunate property of making callchain handling
depend on CONFIG_HW_PERF_EVENTS, even though the callchain handling
could be applied to software events in the absence of PMU hardware
support.
This patch separates the two, placing the callchain handling in
perf_callchain.c and making it depend on CONFIG_PERF_EVENTS rather than
CONFIG_HW_PERF_EVENTS, enabling callchain recording on kernels built
without hardware perf event support.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There are a few remaining uses of printk in the ARM perf code, so move
them over to the pr_* variants instead.
Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Idx sanity check was once implemented separately in these counter
handling functions and then return value was treated as a judgement.
armv7_pmnc_select_counter()
armv7_pmnc_enable_counter()
armv7_pmnc_disable_counter()
armv7_pmnc_enable_intens()
armv7_pmnc_disable_intens()
But we do not need to do this now, as idx validation check was moved
out all these functions by commit 7279adbd9bb8ef8f(ARM: perf: check ARMv7
counter validity on a per-pmu basis).
Let's remove the useless return of idx from these functions.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>