Add kprobe_insn_mutex for protecting kprobe_insn_pages hlist, and remove
kprobe_mutex from architecture dependent code.
This allows us to call arch_remove_kprobe() (and free_insn_slot) while
holding kprobe_mutex.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The atomic_t type cannot currently be used in some header files because it
would create an include loop with asm/atomic.h. Move the type definition
to linux/types.h to break the loop.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Show node to memory section relationship with symlinks in sysfs
Add /sys/devices/system/node/nodeX/memoryY symlinks for all
the memory sections located on nodeX. For example:
/sys/devices/system/node/node1/memory135 -> ../../memory/memory135
indicates that memory section 135 resides on node1.
Also revises documentation to cover this change as well as updating
Documentation/ABI/testing/sysfs-devices-memory to include descriptions
of memory hotremove files 'phys_device', 'phys_index', and 'state'
that were previously not described there.
In addition to it always being a good policy to provide users with
the maximum possible amount of physical location information for
resources that can be hot-added and/or hot-removed, the following
are some (but likely not all) of the user benefits provided by
this change.
Immediate:
- Provides information needed to determine the specific node
on which a defective DIMM is located. This will reduce system
downtime when the node or defective DIMM is swapped out.
- Prevents unintended onlining of a memory section that was
previously offlined due to a defective DIMM. This could happen
during node hot-add when the user or node hot-add assist script
onlines _all_ offlined sections due to user or script inability
to identify the specific memory sections located on the hot-added
node. The consequences of reintroducing the defective memory
could be ugly.
- Provides information needed to vary the amount and distribution
of memory on specific nodes for testing or debugging purposes.
Future:
- Will provide information needed to identify the memory
sections that need to be offlined prior to physical removal
of a specific node.
Symlink creation during boot was tested on 2-node x86_64, 2-node
ppc64, and 2-node ia64 systems. Symlink creation during physical
memory hot-add tested on a 2-node x86_64 system.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: build fix on ia64
ia64's default_affinity_write() still had old cpumask_t usage:
/home/mingo/tip/kernel/irq/proc.c: In function `default_affinity_write':
/home/mingo/tip/kernel/irq/proc.c:114: error: incompatible type for argument 1 of `is_affinity_mask_valid'
make[3]: *** [kernel/irq/proc.o] Error 1
make[3]: *** Waiting for unfinished jobs....
update it to cpumask_var_t.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
These two IOMMUs can implement the current version of this API. So
select the API if one or both of these IOMMU drivers is selected.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Impact: file renamed
The code in the vtd.c file can be reused for other IOMMUs as well. So
rename it to make it clear that it handle more than VT-d.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Impact: New API
The old topology_core_siblings() and topology_thread_siblings() return
a cpumask_t; these new ones return a (const) struct cpumask *.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Fix kvm_arch_vcpu_ioctl_[gs]et_regs() to do something meaningful on
ia64. Old versions could never have worked since they required
pointers to be set in the ioctl payload which were never being set by
the ioctl handler for get_regs.
In addition reserve extra space for future extensions.
The change of layout of struct kvm_regs doesn't require adding a new
CAP since get/set regs never worked on ia64 until now.
This version doesn't support copying the KVM kernel stack in/out of
the kernel. This should be implemented in a seperate ioctl call if
ever needed.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by : Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Since vmm runs in an isolated address space and it is just a copy
of host's kvm-intel module, so once vmm crashes, we just crash all guests
running on it instead of crashing whole kernel.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Use printk infrastructure to print out some debug info once VM crashes.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
kvm-intel module is relocated to an isolated address space
with kernel, so it can't call host kernel's printk for debug
purpose. In the module, we implement the printk to output debug
info of vmm.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Remove the lock protection for kvm halt logic, otherwise,
once other vcpus want to acquire the lock, and they have to
wait all vcpus are waken up from halt.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The cpu time spent by the idle process actually doing something is
currently accounted as idle time. This is plain wrong, the architectures
that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
time spent doing nothing and the time spent by idle doing work. The first
is accounted with account_idle_time and the second with account_system_time.
The architectures that use the account_xxx_time interface directly and not
the account_xxx_ticks interface now need to do the check for the idle
process in their arch code. In particular to improve the system vs true
idle time accounting the arch code needs to measure the true idle time
instead of just testing for the idle process.
To improve the tick based accounting as well we would need an architecture
primitive that can tell us if the pt_regs of the interrupted context
points to the magic instruction that halts the cpu.
In addition idle time is no more added to the stime of the idle process.
This field now contains the system time of the idle process as it should
be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
every tick that occurs while idle is running will be accounted as idle
time.
This patch contains the necessary common code changes to be able to
distinguish idle system time and true idle time. The architectures with
support for VIRT_CPU_ACCOUNTING need some changes to exploit this.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The utimescaled / stimescaled fields in the task structure and the
global cpustat should be set on all architectures. On s390 the calls
to account_user_time_scaled and account_system_time_scaled never have
been added. In addition system time that is accounted as guest time
to the user time of a process is accounted to the scaled system time
instead of the scaled user time.
To fix the bugs and to prevent future forgetfulness this patch merges
account_system_time_scaled into account_system_time and
account_user_time_scaled into account_user_time.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Michael Neuling <mikey@neuling.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Impact: New APIs
The old node_to_cpumask/node_to_pcibus returned a cpumask_t: these
return a pointer to a struct cpumask. Part of removing cpumasks from
the stack.
We can also use the new for_each_cpu_and() to avoid a temporary cpumask,
and a gratuitous test in sn_topology_show.
(Includes fix from KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Impact: change existing irq_chip API
Not much point with gentle transition here: the struct irq_chip's
setaffinity method signature needs to change.
Fortunately, not widely used code, but hits a few architectures.
Note: In irq_select_affinity() I save a temporary in by mangling
irq_desc[irq].affinity directly. Ingo, does this break anything?
(Folded in fix from KOSAKI Motohiro)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: ralf@linux-mips.org
Cc: grundler@parisc-linux.org
Cc: jeremy@xensource.com
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Impact: change calling convention of existing cpumask APIs
Most cpumask functions started with cpus_: these have been replaced by
cpumask_ ones which take struct cpumask pointers as expected.
These four functions don't have good replacement names; fortunately
they're rarely used, so we just change them over.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
Impact: cleanup
Each SMP arch defines these themselves. Move them to a central
location.
Twists:
1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a
CONFIG_INIT_ALL_POSSIBLE for this rather than break them.
2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'.
Those archs simply have phys_cpu_present_map replaced everywhere.
3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky
so I just manipulate them both in sync.
4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map'
declarations.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Mike Travis <travis@sgi.com>
Cc: ink@jurassic.park.msu.ru
Cc: rmk@arm.linux.org.uk
Cc: starvik@axis.com
Cc: tony.luck@intel.com
Cc: takata@linux-m32r.org
Cc: ralf@linux-mips.org
Cc: grundler@parisc-linux.org
Cc: paulus@samba.org
Cc: schwidefsky@de.ibm.com
Cc: lethal@linux-sh.org
Cc: wli@holomorphy.com
Cc: davem@davemloft.net
Cc: jdike@addtoit.com
Cc: mingo@redhat.com
With the introduction of the generic affinity autoselector,
irq_select_affinity(), IRQs are now being retargetted,
using a default mask, via the request_irq() path.
This results in all IRQs targetted at CPU 0.
SN Altix assigns affinity in the SN PROM, and does not
expect that to be changed as part of request_irq().
Set the IRQ_AFFINITY_SET flag to prevent
request_irq() from resetting affinity.
Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The generic_defconfig has three section mismatches. This clears
arch_unregister_cpu()
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The generic_defconfig has three section mismatches. This clears up
sn_check_wars().
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The AUTOFS=y and AUTOFS4=y causes problems with some distros versions of
automount. I turned both of those to =m and then followed the default
prompts for everything else. I did notice that CONFIG_PNP_DEBUG got
changed to CONFIG_PNP_DEBUG_MESSAGES and the default was a =y so I turned
that back to a =n.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
As noted by Akinobu Mita in patch b1fceac2b9,
alloc_bootmem and related functions never return NULL and always return a
zeroed region of memory. Thus a NULL test or memset after calls to these
functions is unnecessary.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
CC arch/ia64/kernel/asm-offsets.s
In file included from include/linux/bitops.h:17,
from include/linux/kernel.h:15,
from include/linux/sched.h:52,
from arch/ia64/kernel/asm-offsets.c:9:
arch/ia64/include/asm/bitops.h: In function 'set_bit':
arch/ia64/include/asm/bitops.h:47: error: implicit declaration of function 'BUILD_BUG_ON'
Obvious inclusion of kernel.h doesn't fix it, because of circular dependencies
involving fls.h and log2(). Fixing the latter requires some serious header surgery,
it seems, so just remove BUILD_BUG_ON for now.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Simply replace netdev->priv with netdev_priv().
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All architectures now use the generic compat_sys_ptrace, as should every
new architecture that needs 32bit compat (if we'll ever get another).
Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also
kill a comment about __ARCH_SYS_PTRACE that was added after
__ARCH_SYS_PTRACE was already gone.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mathieu Desnoyers reported this build failure on powerpc:
kernel/sched.c: In function 'sd_init_NODE':
kernel/sched.c:7319: error: non-static initialization of a flexible array member
kernel/sched.c:7319: error: (near initialization for '(anonymous)')
this happens because .span changed to cpumask_var_t, hence
the static CPU_MASK_NONE initializers in the SD_*_INIT
templates are not type-correct anymore.
Remove them, as they default to empty anyway.
Also remove them from IA64, MIPS and SH.
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
PAL_VPS_RESUME_HANDLER should use r26 to hold vac fields according to SDM.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Use CFLAGS_vcpu.o, not EXTRA_CFLAGS, to provide fixed register information
to the compiler.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
fix xen_get_eflags. It doesn't take any argument.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
pv_cpu_ops.getreg(_IA64_REG_IP) returned constant.
But the returned ip valued should be the one in the caller, not of the callee.
This patch fixes that.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
arch/ia64/kernel/pci-dma.c only needs to include iommu once.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Using printk from MCA/INIT context is unsafe since it can cause deadlock.
The ia64_mca_modify_original_stack is called from both of mca handler and
init handler, so it should use mprintk instead of printk.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Itanium processors can handle some misaligned data accesses. They
also provide a mode where all such accesses are forced to trap. The
kernel was schizophrenic about use of this mode:
* Base kernel code ran in permissive mode where the only traps
generated were from those cases that the h/w could not handle.
* Interrupt, syscall and trap code ran in strict mode where all
unaligned accesses caused traps to the 0x5a00 unaligned reference
vector.
Use strict alignment checking throughout the kernel, but make
sure that we continue to let user mode use more relaxed mode
as the default.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Use RCU to access another task's creds and to release a task's own creds.
This means that it will be possible for the credentials of a task to be
replaced without another task (a) requiring a full lock to read them, and (b)
seeing deallocated memory.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Wrap current->cred and a few other accessors to hide their actual
implementation.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Separate the task security context from task_struct. At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.
Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.
With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>