In the event of a lock steal or owner died,
rt_mutex_start_proxy_lock() will give the rt_mutex to the
waiting task, but it fails to release the wait_lock. This leads
to subsequent deadlocks when other tasks try to acquire the
rt_mutex.
I also removed a few extra blank lines that really spaced this
routine out. I must have been high on the \n when I wrote this
originally...
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dinakar Guniguntala <dino@in.ibm.com>
Cc: John Stultz <johnstul@linux.vnet.ibm.com>
LKML-Reference: <4A79D7F1.4000405@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The commit:
commit e0fdace10e
Author: David Miller <davem@davemloft.net>
Date: Fri Aug 1 01:11:22 2008 -0700
debug_locks: set oops_in_progress if we will log messages.
Otherwise lock debugging messages on runqueue locks can deadlock the
system due to the wakeups performed by printk().
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Will permanently set oops_in_progress on any lockdep failure.
When this triggers it will cause any read from the ring buffer to
permanently disable the ring buffer (not to mention no locking of
printk).
This patch removes the check. It keeps the print in NMI which makes
sense. This is probably OK, since the ring buffer should not cause
something to set oops_in_progress anyway.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The function ring_buffer_discard_commit inverted the code path
taken on the result of try_to_discard. It should skip incrementing the
entry counter if try_to_discard succeeded. Instead, it increments
the entry counter if the discard succeeded, and does not increment
it if the discard failed.
The result of this bug is that filtering will make the stat counters
incorrect.
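For illustration only, the intended control flow can be sketched in
userspace C. The struct and function names below are hypothetical
stand-ins, not the kernel's ring buffer API; the only point is that
the counter is bumped when the discard did not succeed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU buffer statistics. */
struct demo_cpu_buffer {
    unsigned long entries;   /* number of entries still in the buffer */
};

/*
 * Finish a discard request. 'discarded' is assumed to be the result of
 * the attempt to remove the event from the buffer.
 */
static void demo_discard_commit(struct demo_cpu_buffer *buf, bool discarded)
{
    if (discarded)
        return;          /* event is gone: do not count it */

    buf->entries++;      /* event stays in the buffer: count it */
}
```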
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable. This patch causes SELinux to
ignore the tunable and instead use a separate Kconfig option specific to how
much space the LSM should protect.
The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.
This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The state machine described in the comments wasn't updated with
a follow-on fix. Address that and clean up the corresponding
commentary in the function.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <4A737C2A.9090001@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Two important aspects of the schedule_work() function are not
yet documented:
- that it is allowed to pass a struct work_struct * to this
function that is already on the kernel-global workqueue;
- the meaning of its return value.
The patch below documents both aspects.
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Cc: "Greg Kroah-Hartman" <gregkh@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <200907301900.54202.bart.vanassche@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Prevent calling do_nanosleep() with clockid
CLOCK_MONOTONIC_RAW, since it may cause an oops, such as a NULL
pointer dereference.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: <stable@kernel.org>
LKML-Reference: <4A764FF3.50607@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The unit is KB, so sizeof(struct circular_queue) should be
divided by 1024.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Cc: davem@davemloft.net
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: a.p.zijlstra@chello.nl
LKML-Reference: <1249220616-7190-1-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We still can apply DaveM's generation count optimization to
BFS, based on the following idea:
- before doing each BFS, increase the global generation id
by 1
- if one node in the graph has been visited, mark it as
visited by storing the current global generation id into
the node's dep_gen_id field
- so we can decide if one node has been visited already, by
comparing the node's dep_gen_id with the global generation id.
By applying DaveM's generation count optimization to current
implementation of BFS, we gain the following advantages:
- we save MAX_LOCKDEP_ENTRIES/8 bytes memory;
- we remove the bitmap_zero(bfs_accessed, MAX_LOCKDEP_ENTRIES)
call in each BFS, which is very time-consuming since
MAX_LOCKDEP_ENTRIES may be very large (16384UL).
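The idea can be sketched in self-contained C as follows. This is not
the lockdep code: the node layout, the fixed-size queue and every
demo_* name are assumptions made for the example, and generation
counter wrap-around is ignored.

```c
#include <stddef.h>

#define DEMO_MAX_NODES 8

/* Hypothetical graph node; dep_gen_id mirrors the field named above. */
struct demo_node {
    unsigned int dep_gen_id;          /* generation of the last visit */
    size_t nr_edges;
    size_t edges[DEMO_MAX_NODES];     /* indices of successor nodes */
};

static unsigned int demo_gen_id;      /* global generation id */

static int demo_visited(const struct demo_node *n)
{
    return n->dep_gen_id == demo_gen_id;
}

static void demo_mark_visited(struct demo_node *n)
{
    n->dep_gen_id = demo_gen_id;
}

/*
 * Breadth-first search from 'src'; returns 1 if 'target' is reachable.
 * No per-search bitmap is cleared: bumping demo_gen_id invalidates the
 * marks left by all earlier searches in constant time.
 */
static int demo_bfs_reachable(struct demo_node *g, size_t src, size_t target)
{
    size_t queue[DEMO_MAX_NODES];
    size_t head = 0, tail = 0;

    demo_gen_id++;
    demo_mark_visited(&g[src]);
    queue[tail++] = src;

    while (head < tail) {
        size_t cur = queue[head++];

        if (cur == target)
            return 1;

        for (size_t i = 0; i < g[cur].nr_edges; i++) {
            size_t next = g[cur].edges[i];

            if (demo_visited(&g[next]))
                continue;
            /* Mark on enqueue so each node is queued at most once. */
            demo_mark_visited(&g[next]);
            queue[tail++] = next;
        }
    }
    return 0;
}
```

Nodes start with dep_gen_id == 0 and the first search runs with
generation 1, so zero-initialized nodes count as unvisited.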
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "David S. Miller" <davem@davemloft.net>
LKML-Reference: <1248274089-6358-1-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
spin_lock_nest_lock() allows taking many instances of the same
class, which can easily overflow MAX_LOCK_DEPTH.
To avoid this overflow, we'll stop accounting instances but
start reference counting the class in the held_lock structure.
[ We could maintain a list of instances, if we'd move the hlock
stuff into __lock_acquired(), but that would require
significant modifications to the current code. ]
We restrict this mode to spin_lock_nest_lock() only, because it
degrades the lockdep quality due to the loss of instance information.
For lockstat this means we don't track lock statistics for any
but the first lock in the series.
Currently nesting is limited to 11 bits because that was the
spare space available in held_lock. This yields a maximum of 2048
instances.
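As a rough userspace sketch of reference counting in an 11-bit field:
the struct below is a hypothetical subset, not the real held_lock
layout, and the helper names are made up. An 11-bit field can store
the values 0 through 2047 (2048 distinct values); how the kernel maps
those values to an instance count is not shown here.

```c
#define DEMO_REF_BITS 11
#define DEMO_REF_MAX  ((1U << DEMO_REF_BITS) - 1)   /* largest storable value */

/* Hypothetical subset of a held-lock record; only 'references' matters. */
struct demo_held_lock {
    unsigned int class_idx;
    unsigned int references:DEMO_REF_BITS;   /* nested instances of the class */
};

/* Account one more nested instance; returns -1 instead of wrapping around. */
static int demo_nest_acquire(struct demo_held_lock *h)
{
    if (h->references == DEMO_REF_MAX)
        return -1;
    h->references++;
    return 0;
}

/* Drop one instance; returns 1 once no instance is left. */
static int demo_nest_release(struct demo_held_lock *h)
{
    if (h->references > 0)
        h->references--;
    return h->references == 0;
}
```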
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a lockdep helper to validate that we indeed are the owner
of a lock.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix a few comments and whitespace issues that annoyed me.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This build bug:
In file included from kernel/sched.c:1765:
kernel/sched_rt.c: In function ‘has_pushable_tasks’:
kernel/sched_rt.c:1069: error: ‘struct rt_rq’ has no member named ‘pushable_tasks’
kernel/sched_rt.c: In function ‘pick_next_task_rt’:
kernel/sched_rt.c:1084: error: ‘struct rq’ has no member named ‘post_schedule’
Triggers because both pushable_tasks and post_schedule are
SMP-only fields.
Move has_pushable_tasks() to the SMP section and #ifdef the post_schedule use.
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090729150422.17691.55590.stgit@dev.haskins.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix:
kernel/built-in.o: In function `lockdep_stats_show':
lockdep_proc.c:(.text+0x48202): undefined reference to `max_bfs_queue_depth'
As max_bfs_queue_depth is only available under
CONFIG_PROVE_LOCKING=y.
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Like sched_migrate_task(), set_cpus_allowed_ptr() should hold
onto the migration thread too.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A frequent mistake appears to be to call task_of() on a
scheduler entity that is not actually a task, which can result
in a wild pointer.
Add a check to catch these mistakes.
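A userspace sketch of such a check, under stated assumptions: the
demo_* types are hypothetical, and using a non-NULL my_q pointer to
mark a group entity is an assumption of this example. Where a real
check might warn, the sketch simply returns NULL.

```c
#include <stddef.h>

struct demo_cfs_rq {
    int nr_running;
};

struct demo_sched_entity {
    struct demo_cfs_rq *my_q;   /* assumed non-NULL only for group entities */
};

struct demo_task {
    int pid;
    struct demo_sched_entity se;
};

#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static int demo_entity_is_task(const struct demo_sched_entity *se)
{
    return se->my_q == NULL;
}

/* Returns the owning task, or NULL when 'se' is not embedded in a task. */
static struct demo_task *demo_task_of(struct demo_sched_entity *se)
{
    if (!demo_entity_is_task(se))
        return NULL;   /* container_of() here would yield a wild pointer */
    return demo_container_of(se, struct demo_task, se);
}
```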
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reflect "active" cpus in the rq->rd->online field, instead of
the online_map.
The motivation is that things that use the root-domain code
(such as cpupri) only care about cpus classified as "active"
anyway. By synchronizing the root-domain state with the active
map, we allow several optimizations.
For instance, we can remove an extra cpumask_and from the
scheduler hotpath by utilizing rq->rd->online (since it is now
a cached version of cpu_active_map & rq->rd->span).
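The caching can be pictured with single-word bitmasks. This is only a
sketch: the demo_* names are hypothetical, and real cpumasks are not
limited to one word.

```c
typedef unsigned long demo_mask_t;

/* Hypothetical, single-word stand-in for a root domain. */
struct demo_root_domain {
    demo_mask_t span;     /* CPUs that belong to this domain */
    demo_mask_t online;   /* cached: active CPUs within span */
};

/* Slow path: refresh the cache whenever the active map changes. */
static void demo_rd_update(struct demo_root_domain *rd, demo_mask_t active)
{
    rd->online = active & rd->span;
}

/* Hot path: one AND against the cached mask instead of two. */
static demo_mask_t demo_candidates(const struct demo_root_domain *rd,
                                   demo_mask_t allowed)
{
    return allowed & rd->online;
}
```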
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090730145723.25226.24493.stgit@dev.haskins.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We currently have an explicit "needs_post" vtable method which
returns a stack variable for whether we should later run
post-schedule. This leads to an awkward exchange of the
variable as it bubbles back up out of the context switch. Peter
Zijlstra observed that this information could be stored in the
run-queue itself instead of handled on the stack.
Therefore, we revert to the method of having context_switch
return void, and update an internal rq->post_schedule variable
when we require further processing.
In addition, we fix a race condition where we try to access
current->sched_class without holding the rq->lock. This is
technically racy, as the sched-class could change out from
under us. Instead, we reference the per-rq post_schedule
variable with the runqueue unlocked, but with preemption
disabled to see if we need to reacquire the rq->lock.
Finally, we clean the code up slightly by removing the #ifdef
CONFIG_SMP conditionals from the schedule() call, and implement
some inline helper functions instead.
This patch passes checkpatch, and rt-migrate.
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090729150422.17691.55590.stgit@dev.haskins.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need to add the new prio to the cpupri accounting before
removing the old prio. This is because removing the old prio
first will open a race window where the cpu will be removed
from pri_active. In this case the cpu will not be visible for
RT push and pulls. This could cause a RT task to not migrate
appropriately, and create a very large latency.
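The ordering can be sketched with single-word masks. All demo_* names
are hypothetical and the sketch is single-threaded; it only shows that
with add-before-remove the cpu is never absent from every priority
level.

```c
#define DEMO_NR_PRIO 4

/* Hypothetical, single-word stand-in for the cpupri state. */
struct demo_cpupri {
    unsigned long mask[DEMO_NR_PRIO];   /* CPUs currently at each priority */
    unsigned long pri_active;           /* bit p set when mask[p] is non-empty */
};

static void demo_prio_add(struct demo_cpupri *cp, int cpu, int prio)
{
    cp->mask[prio] |= 1UL << cpu;
    cp->pri_active |= 1UL << prio;
}

static void demo_prio_remove(struct demo_cpupri *cp, int cpu, int prio)
{
    cp->mask[prio] &= ~(1UL << cpu);
    if (!cp->mask[prio])
        cp->pri_active &= ~(1UL << prio);
}

/* Number of priority levels at which 'cpu' is currently visible. */
static int demo_cpu_visible(const struct demo_cpupri *cp, int cpu)
{
    int n = 0;

    for (int p = 0; p < DEMO_NR_PRIO; p++)
        if ((cp->pri_active & (1UL << p)) && (cp->mask[p] & (1UL << cpu)))
            n++;
    return n;
}

static void demo_cpupri_set(struct demo_cpupri *cp, int cpu,
                            int oldpri, int newpri)
{
    if (oldpri == newpri)
        return;
    demo_prio_add(cp, cpu, newpri);      /* first become visible at newpri */
    demo_prio_remove(cp, cpu, oldpri);   /* only then drop oldpri */
}
```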
This bug was found with the use of ftrace sched events and
trace_printk.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090729042526.438281019@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current method for pushing RT tasks after scheduling only
happens after a context switch. But we found cases where a task
is set up on a run queue to be pushed but the push never
happens because the schedule chooses the same task.
This bug was found with the help of Gregory Haskins and the use
of ftrace (trace_printk). It took several days for both of us
analyzing the code and the trace output to find this.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090729042526.205923666@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When cgroup group scheduling is built in, skip some code paths
if we don't have any (but the root) cgroups configured.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit ec4e0e2fe0 ("fix
inconsistency when redistribute per-cpu tg->cfs_rq shares")
broke cgroup smp fairness.
In order to avoid starvation of newly placed tasks, we never
quite set the share of an empty cpu group-task to 0, but
instead we set it as if there's a single NICE-0 task present.
If however we actually set this in cfs_rq[cpu]->shares, that
means the total shares for that group will be slightly inflated
every time we balance, causing the observed unfairness.
Fix this by setting cfs_rq[cpu]->shares to 0 but actually
setting the effective weight of the related se to the inflated
number.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1248696557.6987.1615.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Background:
Several race conditions in the scheduler have cropped up
recently, which Steven and I have tracked down using ftrace.
The most recent one turns out to be a race in how the scheduler
determines a suitable migration target for RT tasks, introduced
recently with commit:
commit 68e74568fb
Date: Tue Nov 25 02:35:13 2008 +1030
sched: convert struct cpupri_vec cpumask_var_t.
The original design of cpupri allowed lockless readers to
quickly determine a best-estimate target. Races between the
pri_active bitmap and the vec->mask were handled in the
original code because we would detect and return "0" when this
occurred. The design was predicated on the *effective*
atomicity (*) of caching the result of cpus_and() between the
cpus_allowed and the vec->mask.
Commit 68e74568 changed the behavior such that vec->mask is
accessed multiple times. This introduces a subtle race, the
result of which means we can have a result that returns "1",
but with an empty bitmap.
*) yes, we know cpus_and() is not a locked operator across the
entire composite array, but it is implicitly atomic on a
per-word basis which is all the design required to work.
Implementation:
Rather than forgoing the lockless design, or reverting to a
stack-based cpumask_t, we simply check for when the race has
been encountered and continue processing in the event that the
race is hit. This renders the removal race as if the priority
bit had been atomically cleared as well, and allows the
algorithm to execute correctly.
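A simplified sketch of "check and continue", using single-word masks.
The demo_* names are hypothetical and this is not the cpupri code; it
only shows an empty intersection being treated like a cleared priority
bit rather than reported as a hit.

```c
#define DEMO_NR_LEVELS 4

/* Hypothetical, single-word view of the priority vectors. */
struct demo_pri_vecs {
    unsigned long pri_active;              /* bit p set: level p looked populated */
    unsigned long mask[DEMO_NR_LEVELS];    /* CPUs at each level */
};

/*
 * Find the lowest level holding a CPU from 'allowed'.
 * Returns 1 and stores the candidates in *lowest, or returns 0.
 */
static int demo_pri_find(const struct demo_pri_vecs *cp,
                         unsigned long allowed, unsigned long *lowest)
{
    for (int p = 0; p < DEMO_NR_LEVELS; p++) {
        unsigned long candidates;

        if (!(cp->pri_active & (1UL << p)))
            continue;

        candidates = allowed & cp->mask[p];

        /*
         * Empty result: either no allowed CPU sits at this level, or the
         * vector was emptied after pri_active was sampled. Keep scanning
         * instead of returning success with an empty bitmap.
         */
        if (!candidates)
            continue;

        *lowest = candidates;
        return 1;
    }
    return 0;
}
```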
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090730145728.25226.92769.stgit@dev.haskins.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The latencytop and sleep accounting code assumes that any
scheduler entity represents a task; this is not so.
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In order to be able to distinguish between no samples due to
inactivity and no samples due to task ended, Arjan asked for
PERF_EVENT_EXIT events. This is useful to the boot delay
instrumentation (bootchart) app.
This patch changes the PERF_EVENT_FORK to be emitted on every
clone, and adds PERF_EVENT_EXIT to be emitted on task exit,
after the task's counters have been closed.
This task tracing is controlled through: attr.comm || attr.mmap
and through the new attr.task field.
Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[ cleaned up perf_counter.h a bit ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently the counter value returned by read() is the value of
the parent counter, to which child counters are only fed back
on child exit.
Thus read() can return rather erratic (and meaningless) numbers
depending on the state of the child processes.
Change this by always iterating the full child hierarchy on
read() and sum all counters.
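Conceptually, the read then becomes a walk over the hierarchy. The
sketch below is plain userspace C with a hypothetical demo_counter
type; it ignores locking and is not the perf counter implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter node with a child list. */
struct demo_counter {
    uint64_t count;                 /* events counted by this counter itself */
    struct demo_counter *child;     /* first child, or NULL */
    struct demo_counter *sibling;   /* next child of the same parent, or NULL */
};

/* Sum the counter's own value and the values of all of its descendants. */
static uint64_t demo_counter_read(const struct demo_counter *c)
{
    uint64_t total = c->count;

    for (const struct demo_counter *k = c->child; k; k = k->sibling)
        total += demo_counter_read(k);
    return total;
}
```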
Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When debugging a recent lockup bug I found various deficiencies
in how our current lockup detection helpers work:
- SysRq-L is not very efficient as it uses a workqueue, hence
it cannot punch through hard lockups and cannot see through
most soft lockups either.
- The SysRq-L code depends on the NMI watchdog - which is off
by default.
- We don't print backtraces from the RCU code's built-in
'RCU state machine is stuck' debug code. This debug
code tends to be one of the first (and only) mechanisms
that show that a lockup has occurred.
This patch changes the code so that we:
- Trigger the NMI backtrace code from SysRq-L instead of using
a workqueue (which cannot punch through hard lockups)
- Trigger print-all-CPU-backtraces from the RCU lockup detection
code
Also decouple the backtrace printing code from the NMI watchdog:
- Don't use variable size cpumasks (they might not be initialized
and they are a bit more fragile anyway)
- Trigger an NMI immediately via an IPI, instead of waiting
for the NMI tick to occur. This is a lot faster and can
produce more relevant backtraces. It will also work if the
NMI watchdog is disabled.
- Don't print the 'dazed and confused' message when we print
a backtrace from the NMI
- Do a show_regs() plus a dump_stack() to get maximum info
out of the dump. Worst-case we get two stacktraces - which
is not a big deal. Sometimes, if register content is
corrupted, the precise stack walker in show_regs() won't
give us a full backtrace - in this case dump_stack() will
do it.
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The previous commit ("do_sigaltstack: avoid copying 'stack_t' as a
structure to user space") fixed a real bug. This one just cleans up the
copy from user space so that gcc can generate better code for it (and so
that it looks the same as the later copy back to user space).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ulrich Drepper correctly points out that there is generally padding in
the structure on 64-bit hosts, and that copying the structure from
kernel to user space can leak information from the kernel stack in those
padding bytes.
Avoid the whole issue by just copying the three members one by one
instead, which also means that the function can avoid the need for
a stack frame. This also happens to match how we copy the new structure
from user space, so it all even makes sense.
[ The obvious solution of adding a memset() generates horrid code, gcc
does really stupid things. ]
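The member-by-member approach looks roughly like this in userspace C.
demo_stack and demo_put() are hypothetical stand-ins: a kernel
implementation would use a user-access helper where this sketch uses
memcpy(). Because only the members are written, padding bytes in the
source never reach the destination.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical three-member structure; 'flags' is typically followed by
 * padding on 64-bit ABIs. */
struct demo_stack {
    void *sp;
    int flags;
    size_t size;
};

/* Stand-in for a per-member store to user space; always succeeds here. */
static int demo_put(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Copy the three values out one member at a time; returns 0 on success. */
static int demo_copy_out(struct demo_stack *dst, void *sp, int flags,
                         size_t size)
{
    int err = 0;

    err |= demo_put(&dst->sp, &sp, sizeof(sp));
    err |= demo_put(&dst->flags, &flags, sizeof(flags));
    err |= demo_put(&dst->size, &size, sizeof(size));
    return err;
}
```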
Reported-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use kernel_text_address() for checking probe address instead of
__kernel_text_address(), because __kernel_text_address() returns true
for init functions even after releasing those functions.
That will hit a BUG() in text_poke().
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When profile= is used, a large buffer is allocated early at boot. This
can be larger than what the page allocator can provide so it prints a
warning. However, the caller is able to handle the situation so this
patch suppresses the warning.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commit ec64f51545 ("cgroup: fix
frequent -EBUSY at rmdir"), cgroup's rmdir (especially against memcg)
no longer returns -EBUSY because of temporary ref counts. That commit
assumed that all refs held after pre_destroy() are temporary, but they
were not, so rmdir can wait forever. This patch fixes that with the
following changes:
- set CGRP_WAIT_ON_RMDIR flag before pre_destroy().
- clear the CGRP_WAIT_ON_RMDIR flag when the subsys finds a racy case;
if there are sleepers, wake them up.
- rmdir() sleeps only when CGRP_WAIT_ON_RMDIR flag is set.
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reported-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Paul Menage <menage@google.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The bug was introduced by commit cc31edceee
("cgroups: convert tasks file to use a seq_file with shared pid array").
We cache a pid array for all threads that are opening the same "tasks"
file, but the pids in the array are always from the namespace of the
last process that opened the file, so all other threads will read pids
from that namespace instead of their own namespaces.
To fix it, we maintain a list of pid arrays, which is keyed by pid_ns.
The list will have length 1 most of the time.
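A minimal sketch of a list keyed by namespace, in userspace C. The
demo_* types and the find-or-create helper are hypothetical; locking
and reference counting are left out.

```c
#include <stdlib.h>

struct demo_pid_ns {
    int level;
};

/* One cached pid array per namespace. */
struct demo_pidlist {
    const struct demo_pid_ns *ns;   /* lookup key */
    int *pids;                      /* pids as seen from 'ns' */
    size_t nr_pids;
    struct demo_pidlist *next;
};

/* Return the entry for 'ns', creating an empty one if none exists yet. */
static struct demo_pidlist *demo_pidlist_get(struct demo_pidlist **head,
                                             const struct demo_pid_ns *ns)
{
    struct demo_pidlist *l;

    for (l = *head; l; l = l->next)
        if (l->ns == ns)
            return l;

    l = calloc(1, sizeof(*l));
    if (!l)
        return NULL;
    l->ns = ns;
    l->next = *head;
    *head = l;
    return l;
}

/* Release every entry and its pid array. */
static void demo_pidlist_free_all(struct demo_pidlist **head)
{
    while (*head) {
        struct demo_pidlist *l = *head;

        *head = l->next;
        free(l->pids);
        free(l);
    }
}
```

A linear list is enough here because the list is expected to hold a
single entry in the common case.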
Reported-by: Paul Menage <menage@google.com>
Idea-by: Paul Menage <menage@google.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: Serge Hallyn <serue@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Setting
"crashkernel=512M-2G:64M,2G-:128M"
does not work, but it does work if it has trailing whitespace,
like
"crashkernel=512M-2G:64M,2G-:128M ".
This was caused by a bug in the parser, which ran past the end of the cmdline.
This patch adds a check of the termination.
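The kind of termination check meant here can be sketched in userspace
C. This is not the kernel parser: demo_memparse() and
demo_count_ranges() are hypothetical, and the accepted syntax is
reduced to comma-separated "start-[end]:size" entries.

```c
#include <stdlib.h>

/* Simplified size parser: decimal number with optional K/M/G suffix. */
static unsigned long long demo_memparse(const char *s, char **end)
{
    unsigned long long v = strtoull(s, end, 10);

    if (*end == s)
        return 0;               /* no digits: leave *end at s */

    switch (**end) {
    case 'G':
        v <<= 10;
        /* fall through */
    case 'M':
        v <<= 10;
        /* fall through */
    case 'K':
        v <<= 10;
        (*end)++;
        break;
    default:
        break;
    }
    return v;
}

/* Returns the number of entries parsed, or -1 on malformed input. */
static int demo_count_ranges(const char *arg)
{
    const char *cur = arg;
    int n = 0;

    for (;;) {
        char *tmp;

        demo_memparse(cur, &tmp);           /* range start */
        if (tmp == cur || *tmp != '-')
            return -1;
        cur = tmp + 1;

        if (*cur != ':') {                  /* optional range end */
            demo_memparse(cur, &tmp);
            if (tmp == cur)
                return -1;
            cur = tmp;
        }
        if (*cur != ':')
            return -1;
        cur++;

        demo_memparse(cur, &tmp);           /* size */
        if (tmp == cur)
            return -1;
        cur = tmp;
        n++;

        if (*cur == ',') {                  /* another entry follows */
            cur++;
            continue;
        }
        /* Explicit termination: end of string or end of this parameter. */
        if (*cur == '\0' || *cur == ' ')
            return n;
        return -1;
    }
}
```

With this check, the parameter is accepted whether or not it is
followed by a space, and the loop never reads past the terminating NUL.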
Reported-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Tested-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a post-2.6.31 regression which was introduced by
2ff05b2b4e ("oom: move oom_adj value from
task_struct to mm_struct").
After moving the oom_adj value from the task struct to the mm_struct, the
oom_adj value was no longer properly inherited by child processes.
Copying over the oom_adj value at fork time fixes that bug.
[kosaki.motohiro@jp.fujitsu.com: test for current->mm before dereferencing it]
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Paul Menage <manage@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
About half of the events are missing when we splice_read
from trace_pipe. They are unexpectedly consumed because we ignore
the TRACE_TYPE_NO_CONSUME return value used by the function graph
tracer when it needs to consume the events by itself to walk on
the ring buffer.
The same problem appears with ftrace_dump().
Example of an output before this patch:
1) | ktime_get_real() {
1) 2.846 us | read_hpet();
1) 4.558 us | }
1) 6.195 us | }
After this patch:
0) | ktime_get_real() {
0) | getnstimeofday() {
0) 1.960 us | read_hpet();
0) 3.597 us | }
0) 5.196 us | }
The fix also applies on 2.6.30
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@kernel.org
LKML-Reference: <4A6EEC52.90704@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
When print_graph_entry() computes a function call entry event, it needs
to also check the next entry to guess if it matches the return event of
the current function entry.
In order to look at this next event, it needs to consume the current
entry before going ahead in the ring buffer.
However, if the current event that gets consumed is the last one in the
ring buffer head page, the ring_buffer may reuse the page for writers.
The consumed entry will then become invalid because of possible
racy overwriting.
We must then handle this entry by making a copy of it.
The fix also applies on 2.6.30
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@kernel.org
LKML-Reference: <4A6EEAEC.3050508@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Commit 63706172f3 ("kthreads: rework
kthread_stop()") removed the limitation that the thread function must
not call do_exit() itself, but forgot to update the comment.
Since that commit it is OK to use kthread_stop() even if kthread can
exit itself.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The check_modstruct_version() needs to look up the symbol "module_layout"
in the kernel, but it does so literally and not by a C identifier. The
trouble is that it does not include a symbol prefix for those ports that
need it (like the Blackfin and H8300 port). So make sure we tack on the
MODULE_SYMBOL_PREFIX define to the front of it.
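The effect can be illustrated with string literal concatenation. The
DEMO_* macros are hypothetical stand-ins and "_" is just an example
value; on ports without a prefix it would be the empty string.

```c
/* Hypothetical per-port prefix; "" on ports that do not need one. */
#define DEMO_SYMBOL_PREFIX "_"

/* Adjacent string literals are concatenated at compile time. */
#define DEMO_PREFIXED(name) DEMO_SYMBOL_PREFIX name

/* Name to look up for the layout symbol, including the port's prefix. */
static const char *demo_layout_symbol(void)
{
    return DEMO_PREFIXED("module_layout");
}
```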
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If trace_printk_on_oops is set we lose interesting trace information
when the tracer is enabled across oops handling and printing. We want
the trace which might give us information _WHY_ we oopsed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Some cleanups of the lockdep code after the BFS series:
- Remove the last traces of the generation id
- Fixup comment style
- Move the bfs routines into lockdep.c
- Cleanup the bfs routines
[ tom.leiming@gmail.com: Fix crash ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-11-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add BFS statistics to the existing lockdep stats.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-10-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Also account the BFS memory usage.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
[ fix build for !PROVE_LOCKING ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-9-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implement lockdep_count_{for,back}ward using BFS.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-8-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the shortest lock dependencies' path may be obtained by BFS,
we print the shortest one by print_shortest_lock_dependencies().
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-7-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch uses BFS to implement find_usage_*wards(), which
was originally written using DFS.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-6-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch uses BFS to implement check_noncircular() and
prints the generated shortest circle if one exists.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-5-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
1. Introduce match() to BFS so that it can be used to
match different patterns.
2. Rename some functions to give them more suitable names.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-4-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>