I applied the penultimate version of the perfmon patch, which didn't have
the initialization of the new spinlock that was added.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Patch from Charles Spirakis
Some linux customers want to optimize their applications on the latest
hardware but are not yet willing to upgrade to the latest kernel. This
patch provides a way to plug in an alternate, basic, and GPL'ed PMU
subsystem to help with their monitoring needs or for specialty work. It
can also be used in case of serious unexpected bugs in perfmon. Mutual
exclusion between the two subsystems is guaranteed, hence no conflict
can arise from both subsystem being present.
Acked-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The 2.6 kernel has CPE error thresholding.
This patch lets SAL know of this error handling feature.
The changes are SN specific.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
acpi_request_vector() is called in ia64_mca_init() to get the cpe_vector.
The problem is that acpi_request_vector() looks in platform_intr_list[] to
get the vector, but platform_intr_list[] is not initialized with a valid
vector until later (in sn_setup()). Without a valid vector the code
defaults to polling mode.
This patch moves the call to acpi_request_vector() from ia64_mca_init()
to ia64_mca_late_init(), which is after platform_intr_list[] is initialized.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
convert_to_non_syscall() has the same problem that unwind_to_user()
used to have. Fix it likewise.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
These days <linux/ioctl32.h> handles everything, no need for an asm
header on just two architectures.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some time ago, GAS was fixed to bring the .spillpsp directive in line
with the Intel assembler manual (there was some disagreement as to
whether or not there is a built-in 16-byte offset). Unfortunately,
there are two places in the kernel where this directive is used in
handwritten assembly files and those of course relied on the "buggy"
behavior. As a result, when using a "fixed" assembler, the kernel
picks up the UNaT bits from the wrong place (off by 16) and randomly
sets NaT bits on the scratch registers. This can be noticed easily by
looking at a coredump and finding various scratch registers with
unexpected NaT values. The patch below fixes this by using the
.spillsp directive instead, which works correctly no matter what
assembler is in use.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
I noticed this typo when trying to compile a kernel which had
CONFIG_HOTPLUG turned off. In that case, __devinit is no longer a
no-op and the compiler then detects a section-conflict. Fix by using
__devinitdata instead of __devinit.
Same patch also submitted by Darren Williams to fix compilation error
using sim_defconfig (which has CONFIG_HOTPLUG=n).
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Darren Williams <dsw@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Without this patch, the stack is placed _below_ the current task
structure, which is risky at best.
Tony, I think this patch needs to go into 2.6.12, since it fixes a
real bug. Without it, INIT may case secondary errors, which would be
most unpleasant.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
While looking at code generated by gcc4.0 I noticed some functions still
had frame pointers, even after we stopped ppc64 from defining
CONFIG_FRAME_POINTER. It turns out kernel/Makefile hardwires
-fno-omit-frame-pointer on when compiling schedule.c.
Create CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER and define it on architectures
that dont require frame pointers in sched.c code.
(akpm: blame me for the name)
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the p_nodepda and p_subnodepda pointers from the pda_s structure.
And then define a new per-cpu pointer to the nodepda and export it so
that it can be accessed by kernel modules.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
- pfm_context_load(): change return value from EINVAL to EBUSY
when context is already loaded.
- pfm_check_task_state(): pass test if context state is MASKED.
It is safe to give access on PFM_CTX_MASKED because the PMU
state (PMD) is stable and saved in software state.
This helps multiplexing programs such as the example given
in libpfm-3.1.
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The pmu_active test is based on the values of PSR.up. THIS IS THE PROBLEM as
it does not take into account the lazy restore logic which is as follow (simplified):
context switch out:
save PMDs
clear psr.up
release ownership
context switch in:
if (ctx->last_cpu == smp_processor_id() && ctx->cpu_activation == cpu_activation) {
set psr.up
return
}
restore PMD
restore PMC
ctx->last_cpu = smp_processor_id();
ctx->activation = ++cpu_activation;
set psr.up
The key here is that on context switch out, we clear psr.up and on context switch in
we check if nobody else used the PMU on that processor since last time we came. In
that case, we assume the PMD/PMC are ours and we simply reactivate.
The Caliper problem is that between the moment we context switch out and the moment we
come back, nobody effectively used the PMU BUT the processor went idle. Normally this
would have no incidence but PAL_HALT does alter the PMU registers. In default_idle(),
the test on psr.up is not strong enough to cover this case and we go into PAL which
trashed the PMU resgisters. When we come back we falsely assume that this is our state
yet it is corrupted. Very nasty indeed.
To avoid the problem it is necessary to forbid going to PAL_HALT as soon as perfmon
installs some valid state in the PMU registers. This happens with an application
attaches a context to a thread or CPU. It is not enough to check the psr/dcr bits.
Hence I propose the attached patch. It adds a callback in process.c to modify the
condition to enter PAL on idle. Basically, now it is conditional to pal_halt=1 AND
perfmon saying it is okay.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Correct a bug where tioca_dma_mapped() is putting tioca dma map structs
on the wrong list.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When SAL calls back into the OS, the OS code is running with preempt
disabled so it cannot call sleeping functions.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Jack Steiner uncovered some opportunities for improvement in
the MCA recovery code.
1) Set bsp to save registers on the kernel stack.
2) Disable interrupts while in the MCA recovery code.
3) Change the way the user process is killed, to avoid
a panic in schedule.
Testing shows that these changes make the recovery code much
more reliable with the 2.6.12 kernel.
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Attached is a patch against David's audit.17 kernel that adds checks
for the TIF_SYSCALL_AUDIT thread flag to the ia64 system call and
signal handling code paths. The patch enables auditing of system
calls set up via fsys_bubble_down, as well as ensuring that
audit_syscall_exit() is called on return from sigreturn.
Neglecting to check for TIF_SYSCALL_AUDIT at these points results in
incorrect information in audit_context, causing frequent system panics
when system call auditing is enabled on an ia64 system.
I have tested this patch and have seen no problems with it.
[Original patch from Amy Griffis ported to current kernel by David Woodhouse]
From: Amy Griffis <amy.griffis@hp.com>
From: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Andi noted that during normal runtime cpu_idle_map is bounced around a lot,
and occassionally at a higher frequency than the timer interrupt wakeup
which we normally exit pm_idle from. So switch to a percpu variable.
I didn't move things to the slow path because it would involve adding
scheduler code to wakeup the idle thread on the cpus we're waiting for.
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The following patch fixes a bug in the SGI Altix sn_dma_flush code.
sn_dma_flush is broken in 2.6. The code isn't waiting for the DMA
data to be flushed out of the PIC ASIC. This patch is based off the
linux-ia64-test-2.6.12 tree
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch simplifies a couple places where we search for _PXM
values in ACPI namespace. Thanks,
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The following patch ensures that the correct error interrupt handling
routine is initialized. This patch is based on the 2.6.12 ia64 release tree.
Signed-off-by: Colin Ngam <cngam@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch detects the existence of an uncached physical AMO address setup
by EFI's XPBOOT (SGI) and converts it to an uncached virtual AMO address.
Depends on a patch submitted on 23 March 2005 with the subject of:
[PATCH 2/3] SGI Altix cross partition functionality (2nd revision)
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch contains the cross partition pseudo-ethernet driver (XPNET)
functional support module.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch contains the communication module (XPC) for cross partition
communication on a partitioned SGI Altix.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
cg-patch couldn't apply the patch to Makefile, and my dumb script
rushed on and ran cg-commit without this change.
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch contains the shim module (XP) which interfaces between the
communication module (XPC) and the functional support modules (like XPNET).
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Another step in the effort to eliminate the SN pda structure.
This patch moves the cnodeid_to_nasid_table field out of the pda,
making it a standalone per-cpu data item, and exports it so it can
be accessed by kernel modules.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Here is a patch to enable the SGI tiocx bus driver to distingush between
FPGA-attached h/w and non-FPGA-attached h/w.
Signed-off-by: Bruce Losure <blosure@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Hi Tony,
This patch against ia64-test-2.6.12 fixes a bug where the tiocx code
was inadvertently un-doing some address modifications done in earlier
fixup code. This patch just removes the offending code.
Signed-off-by: Bruce Losure <blosure@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This is a small patch to switch fluch_icache_range() to use fc.i
instead of fc. This would save time on processors which can establish
i-cache coherency without flushing the cache-line out to memory (not
that any current processors do). On existing processors, fc.i behaves
like fc. The only caveat is that very old assemblers may not know
about fc.i yet.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Patch below fixes 3 trivial typos which are caught by the new
assembler (v2.169.90). Please apply.
[Note: fix to memcpy that was also part of this patch was separately
applied from patches by H.J. and Andreas ... so the delta here only
has the other two fixes. -Tony]
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The current ia64 assembler complains about mismatching .proc/.endp pairs.
(Same patch also sent by H.J. Lu)
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Now that we have MC/MT detection patches in, appended patch allows us to
configure MT scheduler optimizations. For now, we will this option off
by default.
There is some discussion going on lkml about setting up sched-domains
which are absolutely needed (like for example, we shouldn't setup SMT domain
for non MT processors). Once that patch goes in, we can enable this option by
default.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Convert most of the current code that uses _NSIG directly to instead use
valid_signal(). This avoids gcc -W warnings and off-by-one errors.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Attached is a patch against David's audit.17 kernel that adds checks
for the TIF_SYSCALL_AUDIT thread flag to the ia64 system call and
signal handling code paths.The patch enables auditing of system
calls set up via fsys_bubble_down, as well as ensuring that
audit_syscall_exit() is called on return from sigreturn.
Neglecting to check for TIF_SYSCALL_AUDIT at these points results in
incorrect information in audit_context, causing frequent system panics
when system call auditing is enabled on an ia64 system.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We were calling ptrace_notify() after auditing the syscall and arguments,
but the debugger could have _changed_ them before the syscall was actually
invoked. Reorder the calls to fix that.
While we're touching ever call to audit_syscall_entry(), we also make it
take an extra argument: the architecture of the syscall which was made,
because some architectures allow more than one type of syscall.
Also add an explicit success/failure flag to audit_syscall_exit(), for
the benefit of architectures which return that in a condition register
rather than only returning a single register.
Change type of syscall return value to 'long' not 'int'.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
[IA64] fix ia64 Kconfig to allow CONFIG_PM on sn2
This probably should have been fixed when I fixed up the generic build for
discontig+numa machines, but oh well.
CONFIG_PM is allowable for generic builds but not for sn2 builds, which
doesn't make much sense, and in fact breaks the build if recent ACPI bits are
added to the tree. It looks like the only arch that needs to prevent
CONFIG_PM stuff is the ski simulator (though those options could probably use
some cleanup as well), so remove the big conditional and replace it with a
simple test for IA64_HP_SIM instead.
Signed-off-by: Jesse Barnes <jbarnes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
vector sharing patch had a typo ... mismatched spin_lock() with
a spin_unlock_irq(). Fix from Kenji Kaneshige.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Rohit and Suresh changed their mind about the order to print things
in /proc/cpuinfo, but didn't include the change in the version of
the patch they sent to me.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Current ia64 linux cannot handle greater than 184 interrupt sources
because of the lack of vectors. The following patch enables ia64 linux
to handle greater than 184 interrupt sources by allowing the same
vector number to be shared by multiple IOSAPIC's RTEs. The design of
this patch is besed on "Intel(R) Itanium(R) Processor Family Interrupt
Architecture Guide".
Even if you don't have a large I/O system, you can see the behavior of
vector sharing by changing IOSAPIC_LAST_DEVICE_VECTOR to fewer value.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Version 3 - rediffed to apply on top of Ashok's hotplug cpu
patch. /proc/cpuinfo output in step with x86.
This is an updated MC/MT identification patch based on the
previous discussions on list.
Add the Multi-core and Multi-threading detection for IPF.
- Add new core and threading related fields in /proc/cpuinfo.
Physical id
Core id
Thread id
Siblings
- setup the cpu_core_map and cpu_sibling_map appropriately
- Handles Hot plug CPU
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Gordon Jin <gordon.jin@intel.com>
Signed-off-by: Rohit Seth <rohit.seth@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
memcpy_mck.S::__copy_user breaks in the prefetch code under these conditions :-
* src is unaligned and
* dst is near the end of a page and
* the page after dst is unmapped.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Thanks to Mark for tracking down this one. Users of __copy_from_user_inatomic()
will be sad if we don't handle lfetch faults for the "no_context" case.
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch against ia64-test-2.6.12 is needed for forthcoming
Altix chipsets. It renames geoid_any_t to geoid_common_t and
splits the 8bit 'slab' field into two 4bit fields for 'slab'
and 'slot'. Similar changes in the Altix SAL will retain backward
compatibility for old kernels.
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Sadly, I goofed in this syscall-tuning patch:
ChangeSet 1.1966.1.40 2005/01/22 13:31:05 davidm@hpl.hp.com
[IA64] Improve ia64_leave_syscall() for McKinley-type cores.
Optimize ia64_leave_syscall() a bit better for McKinley-type cores.
The patch looks big, but that's mostly due to renaming r16/r17 to r2/r3.
Good for a 13 cycle improvement.
The problem is that the size of the physical stacked registers was
loaded into the wrong register (r3 instead of r17). Since r17 by
coincidence always had the value 1, this had the effect of turning
rse_clear_invalid into a no-op. That poses the risk of leaking kernel
state back to user-land and is hence not acceptable.
The fix below is simple, but unfortunately it costs us about 28 cycles
in syscall overhead. ;-(
Unfortunately, there isn't much we can do about that since those
registers have to be cleared one way or another.
--david
Signed-off-by: Tony Luck <tony.luck@intel.com>
patch 2:
Shub2 BTE recovery code will be implemented in SAL.
Define the SAL interface.
Modify bte_error to call SAL for shub2.
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>