Impact: cleanup
Move the BTS bits from ptrace.c into ds.c.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: pack struct thread_info more tightly
Change x86_64's thread_info 'flags' field back to __u32.
This was changed to 'unsigned long' when the thread_info*.h
for i386 and x86_64 were merged. Change it back. We can
do this as only 27 bits of 'flags' are actually used.
This change actually packs down thread_info by 64 bits:
32 bits are saved by the smaller flags, and 32 bits are
saved by the following 'mm_segment_t field' becoming
naturally 64-bit aligned.
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: use deeper function tracing depth safely
Some tests showed that function return tracing needed a more deeper depth
of function calls. But it could be unsafe to store these return addresses
to the stack.
So these arrays will now be allocated dynamically into task_struct of current
only when the tracer is activated.
Typical scheme when tracer is activated:
- allocate a return stack for each task in global list.
- fork: allocate the return stack for the newly created task
- exit: free return stack of current
- idle init: same as fork
I chose a default depth of 50. I don't have overruns anymore.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: help to find the better depth of trace
We decided to arbitrary define the depth of function return trace as
"20". Perhaps this is not enough. To help finding an optimal depth, we
measure now the overrun: the number of functions that have been missed
for the current thread. By default this is not displayed, we have to
do set a particular flag on the return tracer: echo overrun >
/debug/tracing/trace_options And the overrun will be printed on the
right.
As the trace shows below, the current 20 depth is not enough.
update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838)
update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838)
do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838)
tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838)
tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838)
vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274)
vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274)
set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274)
con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274)
release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274)
release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274)
con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274)
con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274)
n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274)
smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274)
irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274)
smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274)
ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274)
ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274)
ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274)
hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274)
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add infrastructure for function-return tracing
Add low level support for ftrace return tracing.
This plug-in stores return addresses on the thread_info structure of
the current task.
The index of the current return address is initialized when the task
is the first one (init) and when a process forks (the child). It is
not needed when a task does a sys_execve because after this syscall,
it still needs to return on the kernel functions it called.
Note that the code of return_to_handler has been suggested by Steven
Rostedt as almost all of the ideas of improvements in this V3.
For purpose of security, arch/x86/kernel/process_32.c is not traced
because __switch_to() changes the current task during its execution.
That could cause inconsistency in the stored return address of this
function even if I didn't have any crash after testing with tracing on
this function enabled.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change header guards named "ASM_X86__*" to "_ASM_X86_*" since:
a. the double underscore is ugly and pointless.
b. no leading underscore violates namespace constraints.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Uses xsave/xrstor (instead of traditional fxsave/fxrstor) in context switch
when available.
Introduces TS_XSAVE flag, which determine the need to use xsave/xrstor
instructions during context switch instead of the legacy fxsave/fxrstor
instructions. Thread-synchronous status word is already in L1 cache during
this code patch and thus minimizes the performance penality compared to
(cpu_has_xsave) checks.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This adds TIF_NOTIFY_RESUME support for x86, both 64-bit and 32-bit.
When set, we call tracehook_notify_resume() on the way to user mode.
Signed-off-by: Roland McGrath <roland@redhat.com>
We duplicate alloc/free_thread_info defines on many platforms (the
majority uses __get_free_pages/free_pages). This patch defines common
defines and removes these duplicated defines.
__HAVE_ARCH_THREAD_INFO_ALLOCATOR is introduced for platforms that do
something different.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is the result of an automatic script that consolidates the
format of all the headers in include/asm-x86/.
The format:
1. No leading underscore. Names with leading underscores are reserved.
2. Pathname components are separated by two underscores. So we can
distinguish between mm_types.h and mm/types.h.
3. Everything except letters and numbers are turned into single
underscores.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
random uvesafb failures were reported against Gentoo:
http://bugs.gentoo.org/show_bug.cgi?id=222799
and Mihai Moldovan bisected it back to:
> 8f4d37ec07 is first bad commit
> commit 8f4d37ec07
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Fri Jan 25 21:08:29 2008 +0100
>
> sched: high-res preemption tick
Linus suspected it to be hrtick + vm86 interaction and observed:
> Btw, Peter, Ingo: I think that commit is doing bad things. They aren't
> _incorrect_ per se, but they are definitely bad.
>
> Why?
>
> Using random _TIF_WORK_MASK flags is really impolite for doing
> "scheduling" work. There's a reason that arch/x86/kernel/entry_32.S
> special-cases the _TIF_NEED_RESCHED flag: we don't want to exit out of
> vm86 mode unnecessarily.
>
> See the "work_notifysig_v86" label, and how it does that
> "save_v86_state()" thing etc etc.
Right, I never liked having to fiddle with those TIF flags. Initially I
needed it because the hrtimer base lock could not nest in the rq lock.
That however is fixed these days.
Currently the only reason left to fiddle with the TIF flags is remote
wakeups. We cannot program a remote cpu's hrtimer. I've been thinking
about using the new and improved IPI function call stuff to implement
hrtimer_start_on().
However that does require that smp_call_function_single(.wait=0) works
from interrupt context - /me looks at the latest series from Jens - Yes
that does seem to be supported, good.
Here's a stab at cleaning this stuff up ...
Mihai reported test success as well.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Mihai Moldovan <ionic@ionic.de>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This closes some arcane holes in single-step handling that can arise
only when user programs set TF directly (via popf or sigreturn) and
then use vDSO (syscall/sysenter) system call entry. In those entry
paths, the clear_TF_reenable case hits and we must check TIF_SINGLESTEP
to be sure our bookkeeping stays correct wrt the user's view of TF.
Signed-off-by: Roland McGrath <roland@redhat.com>
This unifies and cleans up the syscall tracing code on i386 and x86_64.
Using a single function for entry and exit tracing on 32-bit made the
do_syscall_trace() into some terrible spaghetti. The logic is clear and
simple using separate syscall_trace_enter() and syscall_trace_leave()
functions as on 64-bit.
The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
tracing either 32-bit or 64-bit tasks. It behaves just like 32-bit.
Changing syscall_trace_enter() to return the syscall number shortens
all the assembly paths, while adding the SYSEMU feature in a simple way.
Signed-off-by: Roland McGrath <roland@redhat.com>
This unifies the treatment of TIF_SINGLESTEP on i386 and x86_64.
The bit is now excluded from _TIF_WORK_MASK on i386 as it has been
on x86_64. This means the do_notify_resume() path using it is never
used, so TIF_SINGLESTEP is not cleared on returning to user mode.
Both now leave TIF_SINGLESTEP set when returning to user, so that
it's already set on an int $0x80 system call entry. This removes
the need for testing TF on the system_call path. Doing it this way
fixes the regression for PTRACE_SINGLESTEP into a sigreturn syscall,
introduced by commit 1e2e99f0e4.
The clear_TF_reenable case that sets TIF_SINGLESTEP can only happen
on a non-exception kernel entry, i.e. sysenter/syscall instruction.
That will always get to the syscall exit tracing path.
Signed-off-by: Roland McGrath <roland@redhat.com>
Make them similar so that both use THREAD_ORDER and THREAD_FLAGS and have a
THREAD_SIZE definition that is setup in asm/page_xx.h
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The TIF masks are basically the same. x86_32 also has _TIF_SYSCALL_EMU which is
zero for the 64 bit case. The tif masks become the same.
x86_64 has an additional _TIF_DONOTIFY_MASK. Does not hurt for the 32 bit case
since it is only used in x86_64 arch code.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Both TIF lists are essentially the same. x86_32 also has TIF_SYSCALL_EMU which
must be undefined for the 64 bit case.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same for both 32 and 64 bit so simply put it in to the common area.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Both definitions are the same. So move to common x86 area.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge the thread_info definition into one structure definition for both arches.
The __u32 is equal to unsigned long for 32 bit.
sysenter_return is used both for the IA32 emulation for 64 and x86_32.
Avoid complicated #ifdef by simply always including it.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge both. x86_64 has an additional TS_COMPAT that is harmless
for 32 bit.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move shared includes to a common area in thread_info.h
Adds asm/types.h for x86_64 and linux/compiler.h for x86_32. Not needed but
we can avoid some ifdeffing and it simplifies later joining.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Simple merge of both thread_info_32.h and thread_info_64.h into thread_info.h.
Comments for #ifndef __ASM_THREAD_INFO_H and #ifdef __KERNEL__
are the same.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Split the FPU save area from the task struct. This allows easy migration
of FPU context, and it's generally cleaner. It also allows the following
two optimizations:
1) only allocate when the application actually uses FPU, so in the first
lazy FPU trap. This could save memory for non-fpu using apps. Next patch
does this lazy allocation.
2) allocate the right size for the actual cpu rather than 512 bytes always.
Patches enabling xsave/xrstor support (coming shortly) will take advantage
of this.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move the headers to include/asm-x86 and fixup the
header install make rules
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>