kernel_samsung_sm7125

jenna

Author	SHA1	Message	Date
Paul E. McKenney	a682604838	rcu: Teach RCU that idle task is not quiscent state at boot This patch fixes a bug located by Vegard Nossum with the aid of kmemcheck, updated based on review comments from Nick Piggin, Ingo Molnar, and Andrew Morton. It also cleans up the variable-name and function-name language. ;-) The boot CPU runs in the context of its idle thread during boot-up. During this time, idle_cpu(0) will always return nonzero, which will fool Classic and Hierarchical RCU into deciding that a large chunk of the boot-up sequence is a big long quiescent state. This in turn causes RCU to prematurely end grace periods during this time. This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks() function to ignore the idle task as a quiescent state until the system has started up the scheduler in rest_init(), introducing a new non-API function rcu_idle_now_means_idle() to inform RCU of this transition. RCU maintains an internal rcu_idle_cpu_truthful variable to track this state, which is then used by rcu_check_callbacks() to determine if it should believe idle_cpu(). Because this patch has the effect of disallowing RCU grace periods during long stretches of the boot-up sequence, this patch also introduces Josh Triplett's UP-only optimization that makes synchronize_rcu() be a no-op if num_online_cpus() returns 1. This allows boot-time code that calls synchronize_rcu() to proceed normally. Note, however, that RCU callbacks registered by call_rcu() will likely queue up until later in the boot sequence. Although rcuclassic and rcutree can also use this same optimization after boot completes, rcupreempt must restrict its use of this optimization to the portion of the boot sequence before the scheduler starts up, given that an rcupreempt RCU read-side critical section may be preempted. In addition, this patch takes Nick Piggin's suggestion to make the system_state global variable be __read_mostly. 
Changes since v4: o Changes the name of the introduced function and variable to be less emotional. ;-) Changes since v3: o WARN_ON(nr_context_switches() > 0) to verify that RCU switches out of boot-time mode before the first context switch, as suggested by Nick Piggin. Changes since v2: o Created rcu_blocking_is_gp() internal-to-RCU API that determines whether a call to synchronize_rcu() is itself a grace period. o The definition of rcu_blocking_is_gp() for rcuclassic and rcutree checks to see if but a single CPU is online. o The definition of rcu_blocking_is_gp() for rcupreempt checks to see both if but a single CPU is online and if the system is still in early boot. This allows rcupreempt to again work correctly if running on a single CPU after booting is complete. o Added check to rcupreempt's synchronize_sched() for there being but one online CPU. Tested all three variants both SMP and !SMP, booted fine, passed a short rcutorture test on both x86 and Power. Located-by: Vegard Nossum <vegard.nossum@gmail.com> Tested-by: Vegard Nossum <vegard.nossum@gmail.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	16 years ago
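
The rcu_blocking_is_gp() checks described in the changelog above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: struct sim_system and both function names are hypothetical stand-ins for num_online_cpus() and the "scheduler has started" state.

```c
#include <stdbool.h>

/* Hypothetical model of the system state; not a kernel structure. */
struct sim_system {
    int  online_cpus;        /* models num_online_cpus() */
    bool scheduler_started;  /* models "rest_init() has started the scheduler" */
};

/* Classic/tree RCU: with a single online CPU, a blocking call such as
 * synchronize_rcu() is itself a grace period. */
static bool blocking_is_gp_classic(const struct sim_system *s)
{
    return s->online_cpus == 1;
}

/* Preemptible RCU: readers may be preempted once the scheduler runs,
 * so the shortcut is only valid on one CPU during early boot. */
static bool blocking_is_gp_preempt(const struct sim_system *s)
{
    return s->online_cpus == 1 && !s->scheduler_started;
}
```

Under this model, a uniprocessor system after boot still takes the shortcut for the classic variant but not for the preemptible one, which matches the distinction the changelog draws.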
Lai Jiangshan	e4fa4c9701	rcu: add __cpuinit to rcu_init_percpu_data() Impact: reduce memory footprint add __cpuinit to rcu_init_percpu_data(), and this function's text will be discarded after boot when !CONFIG_HOTPLUG_CPU. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	16 years ago
Ingo Molnar	263ec6457b	cpumask: convert RCU implementations, fix Impact: cleanup This warning: kernel/rcuclassic.c: In function ‘rcu_start_batch’: kernel/rcuclassic.c:397: warning: passing argument 1 of ‘cpumask_andnot’ from incompatible pointer type triggers because one usage site of rcp->cpumask was not converted to to_cpumask(rcp->cpumask). There are no ill effects of this bug. Signed-off-by: Ingo Molnar <mingo@elte.hu>	16 years ago
Rusty Russell	bd232f97b3	cpumask: convert RCU implementations Impact: use new cpumask API. rcu_ctrlblk contains a cpumask, and it's highly optimized so I don't want a cpumask_var_t (ie. a pointer) for the CONFIG_CPUMASK_OFFSTACK case. It could use a dangling bitmap, and be allocated in __rcu_init to save memory, but for the moment we use a bitmap. (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK, so we use a bitmap here to show we really mean it). We remove on-stack cpumasks, using cpumask_var_t for rcu_torture_shuffle_tasks() and for_each_cpu_and in force_quiescent_state(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	16 years ago
Rusty Russell	6a7b3dc344	sched: convert nohz_cpu_mask to cpumask_var_t. Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	16 years ago
Ingo Molnar	be19ef82e0	rcu: make rcu-stall debug printout more standard Impact: change debug printout Change "RCU detected CPU stall" to "INFO: RCU detected CPU stall" message, to make it easier for tools to pick up the warning. Signed-off-by: Ingo Molnar <mingo@elte.hu>	16 years ago
Ingo Molnar	2ec2b482b1	rcu: RCU-based detection of stalled CPUs for Classic RCU, fix fix the !CONFIG_RCU_CPU_STALL_DETECTOR path: kernel/rcuclassic.c: In function '__rcu_pending': kernel/rcuclassic.c:609: error: too few arguments to function 'check_cpu_stall' Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	2133b5d7ff	rcu: RCU-based detection of stalled CPUs for Classic RCU This patch adds stalled-CPU detection to Classic RCU. This capability is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which defaults disabled. This is a debugging feature to detect infinite loops in kernel code, not something that non-kernel-hackers would be expected to care about. This feature can detect looping CPUs in !PREEMPT builds and looping CPUs with preemption disabled in PREEMPT builds. This is essentially a port of this functionality from the treercu patch, replacing the stall debug patch that is already in tip/core/rcu (commit `67182ae1c4`). The changes from the patch in tip/core/rcu include making the config variable name match that in treercu, changing from seconds to jiffies to avoid spurious warnings, and printing a boot message when this feature is enabled. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Harvey Harrison	3a72dc8eb5	rcu: fix sparse shadowed variable warning kernel/rcuclassic.c:564:18: warning: symbol 'flags' shadows an earlier one kernel/rcuclassic.c:527:16: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Hiroshi Shimamoto	0cd418ddb1	rcuclassic: fix compiler warning CC kernel/rcuclassic.o kernel/rcuclassic.c: In function 'rcu_init_percpu_data': kernel/rcuclassic.c:705: warning: comparison of distinct pointer types lacks a cast kernel/rcuclassic.c:713: warning: comparison of distinct pointer types lacks a cast flags should be unsigned long. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	275a89bdd3	rcu: use irq-safe locks Some earlier tip/core/rcu patches caused RCU to incorrectly enable irqs too early in boot. This caused Yinghai's repeated-kexec testing to hit oopses, presumably due to device interrupts left over from the prior kernel instance (which would oops the newly booting kernel before it got a chance to reset said devices). This patch therefore converts all the local_irq_disable()s in rcuclassic.c to local_irq_save(). Besides, I never did like local_irq_disable() anyway. ;-) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Hiroshi Shimamoto	0c925d7923	rcuclassic: fix compilation NG fix: CC kernel/rcuclassic.o kernel/rcuclassic.c: In function '__rcu_process_callbacks': kernel/rcuclassic.c:561: error: 'flags' undeclared (first use in this function) kernel/rcuclassic.c:561: error: (Each undeclared identifier is reported only once kernel/rcuclassic.c:561: error: for each function it appears in.) Declare missing variable flags. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	eff9b713ee	rcu: fix locking cleanup fallout Given that the rcp->lock is now acquired from call_rcu(), which can be invoked from irq-disable regions, all acquisitions need to disable irqs. The following patch fixes this. Although I don't have any reason to believe that this is the cause of Yinghai's oops, it does need to be fixed. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	cd95851785	rcu: fix classic RCU locking cleanup lockdep problem On Fri, Aug 15, 2008 at 04:24:30PM +0200, Ingo Molnar wrote: > > Paul, > > one of your two recent RCU patches caused this lockdep splat in -tip > testing: > > -------------------> > Brought up 2 CPUs > Total of 2 processors activated (6850.87 BogoMIPS). > PM: Adding info for No Bus:platform > khelper used greatest stack depth: 3124 bytes left > > ================================= > [ INFO: inconsistent lock state ] > 2.6.27-rc3-tip #1 > --------------------------------- > inconsistent {softirq-on-W} -> {in-softirq-W} usage. > ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes: > (&rcu_ctrlblk.lock){-+..}, at: [<c016d91c>] __rcu_process_callbacks+0x1ac/0x1f0 > {softirq-on-W} state was registered at: > [<c01528e4>] __lock_acquire+0x3f4/0x5b0 > [<c0152b29>] lock_acquire+0x89/0xc0 > [<c076142b>] _spin_lock+0x3b/0x70 > [<c016d649>] rcu_init_percpu_data+0x29/0x80 > [<c075e43f>] rcu_cpu_notify+0xaf/0xd0 > [<c076458d>] notifier_call_chain+0x2d/0x60 > [<c0145ede>] __raw_notifier_call_chain+0x1e/0x30 > [<c075db29>] _cpu_up+0x79/0x110 > [<c075dc0d>] cpu_up+0x4d/0x70 > [<c0a769e1>] kernel_init+0xb1/0x200 > [<c01048a3>] kernel_thread_helper+0x7/0x10 > [<ffffffff>] 0xffffffff > irq event stamp: 14 > hardirqs last enabled at (14): [<c01534db>] trace_hardirqs_on+0xb/0x10 > hardirqs last disabled at (13): [<c014dbeb>] trace_hardirqs_off+0xb/0x10 > softirqs last enabled at (0): [<c012b186>] copy_process+0x276/0x1190 > softirqs last disabled at (11): [<c0105c0a>] call_on_stack+0x1a/0x30 > > other info that might help us debug this: > no locks held by ksoftirqd/0/4. > > stack backtrace: > Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27-rc3-tip #1 > [<c01504dc>] print_usage_bug+0x16c/0x1b0 > [<c0152455>] mark_lock+0xa75/0xb10 > [<c0108b75>] ? sched_clock+0x15/0x30 > [<c015289d>] __lock_acquire+0x3ad/0x5b0 > [<c0152b29>] lock_acquire+0x89/0xc0 > [<c016d91c>] ? 
__rcu_process_callbacks+0x1ac/0x1f0 > [<c076142b>] _spin_lock+0x3b/0x70 > [<c016d91c>] ? __rcu_process_callbacks+0x1ac/0x1f0 > [<c016d91c>] __rcu_process_callbacks+0x1ac/0x1f0 > [<c016d986>] rcu_process_callbacks+0x26/0x50 > [<c0132305>] __do_softirq+0x95/0x120 > [<c0132270>] ? __do_softirq+0x0/0x120 > [<c0105c0a>] call_on_stack+0x1a/0x30 > [<c0132426>] ? ksoftirqd+0x96/0x110 > [<c0132390>] ? ksoftirqd+0x0/0x110 > [<c01411f7>] ? kthread+0x47/0x80 > [<c01411b0>] ? kthread+0x0/0x80 > [<c01048a3>] ? kernel_thread_helper+0x7/0x10 > ======================= > calling init_cpufreq_transition_notifier_list+0x0/0x20 > initcall init_cpufreq_transition_notifier_list+0x0/0x20 returned 0 after 0 msecs > calling net_ns_init+0x0/0x190 > net_namespace: 676 bytes > initcall net_ns_init+0x0/0x190 returned 0 after 0 msecs > calling cpufreq_tsc+0x0/0x20 > initcall cpufreq_tsc+0x0/0x20 returned 0 after 0 msecs > calling reboot_init+0x0/0x20 > initcall reboot_init+0x0/0x20 returned 0 after 0 msecs > calling print_banner+0x0/0x10 > Booting paravirtualized kernel on bare hardware > > <----------------------- > > my guess is on: > > commit `1f7b94cd3d` > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Date: Tue Aug 5 09:21:44 2008 -0700 > > rcu: classic RCU locking and memory-barrier cleanups > > Ingo Fixes a problem detected by lockdep in which rcu->lock was acquired both in irq context and in process context, but without disabling from process context. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	1f7b94cd3d	rcu: classic RCU locking and memory-barrier cleanups This patch simplifies the locking and memory-barrier usage in the Classic RCU grace-period-detection mechanism, incorporating Lai Jiangshan's feedback from the earlier version (http://lkml.org/lkml/2008/8/1/400 and http://lkml.org/lkml/2008/8/3/43). Passed 10 hours of rcutorture concurrent with CPUs being put online and taken offline on a 128-hardware-thread Power machine. My apologies to whoever in the Eastern Hemisphere was planning to use this machine over the Western Hemisphere night, but it was sitting idle and... So this is ready for tip/core/rcu. This patch is in preparation for moving to a hierarchical algorithm to allow the very large SMP machines -- requested by some people at OLS, and there seem to have been a few recent patches in the 4096-CPU direction as well. The general idea is to move to a much more conservative concurrency design, then apply a hierarchy to reduce contention on the global lock by a few orders of magnitude (larger machines would see greater reductions). The reason for taking a conservative approach is that this code isn't on any fast path. Prototype in progress. This patch is against the linux-tip git tree (tip/core/rcu). If you wish to test this against 2.6.26, use the following set of patches: http://www.rdrop.com/users/paulmck/patches/2.6.26-ljsimp-1.patch http://www.rdrop.com/users/paulmck/patches/2.6.26-ljsimpfix-3.patch The first patch combines commits `5127bed588` and `3cac97cbb1` from Lai Jiangshan <laijs@cn.fujitsu.com>, and the second patch contains my changes. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	293a17ebc9	rcu: prevent console flood when one CPU sees another AWOL via RCU One small change needed to keep from flooding the console when one CPU notices that another is AWOL. Unless I am missing something subtle. Otherwise the cleanups look good! Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Ingo Molnar	78635fc739	rcu, debug: detect stalled grace periods, cleanups small cleanups. Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	67182ae1c4	rcu, debug: detect stalled grace periods This is a diagnostic patch for Classic RCU. The approach is to record a timestamp at the beginning of the grace period (in rcu_start_batch()), then have rcu_check_callbacks() complain if: 1. it is running on a CPU that has been holding up grace periods for a long time (say one second). This will identify the culprit assuming that the culprit has not disabled hardware irqs, instruction execution, or some such. 2. it is running on a CPU that is not holding up grace periods, but grace periods have been held up for an even longer time (say two seconds). It is enabled via the default-off CONFIG_DEBUG_RCU_STALL kernel config option. Rather than exponential backoff, it backs off to once per 30 seconds. My feeling upon thinking on it was that if you have stalled RCU grace periods for that long, a few extra printk() messages are probably the least of your worries... Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: David Witbrodt <dawitbro@sbcglobal.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
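
The two complaint conditions in the changelog above can be sketched as a small decision function. This is a hedged illustration only: check_stall(), enum stall_report, and the use of whole seconds are assumptions made for the example, with the thresholds taken from the "say one second" and "say two seconds" wording.

```c
#include <stdbool.h>

/* Hypothetical result type for this sketch. */
enum stall_report { STALL_NONE, STALL_SELF, STALL_OTHER };

#define SELF_STALL_SECS  1  /* "say one second" in the changelog */
#define OTHER_STALL_SECS 2  /* "say two seconds" in the changelog */

/* gp_start_secs models the timestamp recorded when the grace period
 * began; holding_up_gp says whether this CPU still owes a quiescent state. */
static enum stall_report check_stall(long now_secs, long gp_start_secs,
                                     bool holding_up_gp)
{
    long elapsed = now_secs - gp_start_secs;

    if (holding_up_gp && elapsed >= SELF_STALL_SECS)
        return STALL_SELF;   /* case 1: this CPU is the culprit */
    if (!holding_up_gp && elapsed >= OTHER_STALL_SECS)
        return STALL_OTHER;  /* case 2: some other CPU is stalling */
    return STALL_NONE;
}
```

The longer threshold for the second case gives the culprit CPU a chance to report itself first, which is the ordering the changelog describes.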
Rusty Russell	9b1a4d3837	stop_machine: Wean existing callers off stop_machine_run() Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	17 years ago
Lai Jiangshan	5127bed588	rcu classic: new algorithm for callbacks-processing(v2) This is v2; it differs slightly from the v1 that I sent to lkml: use ACCESS_ONCE, and use rcu_batch_after/rcu_batch_before for batch # comparison. rcutorture test result: (hotplugs: do cpu-online/offline once per second) No CONFIG_NO_HZ: OK, 12hours No CONFIG_NO_HZ, hotplugs: OK, 12hours CONFIG_NO_HZ=y: OK, 24hours CONFIG_NO_HZ=y, hotplugs: Failed. (Failed also without my patch applied, exactly the same bug occurred, http://lkml.org/lkml/2008/7/3/24) v1's email thread: http://lkml.org/lkml/2008/6/2/539 v1's description: The current callbacks-processing code/algorithm is very efficient and technical. But when I studied it, I found a disadvantage: In multi-CPU systems, when a new RCU callback is queued (call_rcu[_bh]), the current implementation will very likely invoke this callback only after the grace period for the batch with batch number = rcp->cur+2 has completed. Actually, this callback can be invoked after the grace period for the batch with batch number = rcp->cur+1 has completed. The delay of invocation means that the latency of synchronize_rcu() is extended. More importantly, callbacks usually free memory, and this work is delayed too! The reclaimer needs to free memory as soon as possible when little memory is left. A very simple way can solve this problem: a field (struct rcu_head::batch) is added to record the batch number for the RCU callback. When a new RCU callback is queued, we determine the batch number for this callback (head->batch = rcp->cur+1), and we move this callback to rdp->donelist if we find that head->batch <= rcp->completed when we process callbacks. This simple way reduces the wait time for invocation a lot (from about 2.5 grace periods to 1.5 grace periods on average in multi-CPU systems). This is my algorithm. But I do not add any field to struct rcu_head in my implementation. 
We just need to memorize the last 2 batches and their batch number, because these 2 batches include all entries that for whom the grace period hasn't completed. So we use a special linked-list rather than add a field. Please see the comment of struct rcu_data. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Gautham Shenoy <ego@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Lai Jiangshan	3cac97cbb1	rcu classic: simplify the next pending batch Use a batch number (rcp->pending) instead of a flag (rcp->next_pending). rcu_start_batch() needs to change this flag, so mb()s are needed for memory-access safety. But after this patch is applied, rcu_start_batch() does not change this batch number (rcp->pending); rcp->pending is managed by __rcu_process_callbacks only, and the troublesome mb()s are eliminated. The code also looks simpler and clearer. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Gautham Shenoy <ego@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Lai Jiangshan	199a952876	rcu classic: update qlen when cpu offline When callbacks are moved from offline cpu to this cpu, the qlen field of this rdp should be updated. [ Paul E. McKenney: ] The effect of this bug would be for force_quiescent_state() to be invoked when it should not and vice versa -- wasting cycles in the first case and letting RCU callbacks remain piled up in the second case. The bug is thus "benign" in that it does not result in premature grace-period termination, but should of course be fixed nonetheless. Preemption is disabled by the caller's get_cpu_var(), so we are guaranteed to remain on the same CPU, as required. The local_irq_disable() is indeed needed, otherwise, an interrupt might invoke call_rcu() or call_rcu_bh(), which could cause that interrupt's increment of ->qlen to be lost. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Gautham R Shenoy	8558f8f816	rcu: fix hotplug vs rcu race Dhaval Giani reported this warning during cpu hotplug stress-tests: | On running kernel compiles in parallel with cpu hotplug: | | WARNING: at arch/x86/kernel/smp.c:118 | native_smp_send_reschedule+0x21/0x36() | Modules linked in: | Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1 | [...] | [<c0110355>] native_smp_send_reschedule+0x21/0x36 | [<c014fe8f>] force_quiescent_state+0x47/0x57 | [<c014fef0>] call_rcu+0x51/0x6d | [<c01713b3>] __fput+0x130/0x158 | [<c0171231>] fput+0x17/0x19 | [<c016fd99>] filp_close+0x4d/0x57 | [<c016fdff>] sys_close+0x5c/0x97 IMHO the warning is a spurious one. cpu_online_map is updated by the _cpu_down() using stop_machine_run(). Since force_quiescent_state is invoked from irqs disabled section, stop_machine_run() won't be executing while a cpu is executing force_quiescent_state(). Hence the cpu_online_map is stable while we're in the irq disabled section. However, a cpu might have been offlined _just_ before we disabled irqs while entering force_quiescent_state(). And rcu subsystem might not yet have handled the CPU_DEAD notification, leading to the offlined cpu's bit being set in the rcp->cpumask. Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent sending smp_reschedule() to an offlined CPU. Here's the timeline: CPU_A: cpu_down() -> stop_machine() /* disables preemption and irqs */ -> take_cpu_down() -> cpu_disable() /* this removes cpu from cpu_online_map */ -> restart_machine() /* enables irqs */. ------ WINDOW DURING WHICH rcp->cpumask is stale ------ CPU_B: call_rcu() /* disables irqs here */ -> force_quiescent_state() -> for_each_cpu(rcp->cpumask) smp_send_reschedule() -> WARN_ON() for offlined CPU! CPU_A (meanwhile): CPU_DEAD -> rcu_cpu_notify. ------ WINDOW ENDS ------ CPU_A: rcu_offline_cpu() /* which calls cpu_quiet(), which removes cpu from rcp->cpumask */. 
If a new batch was started just before calling stop_machine_run(), the "tobe-offlined" cpu is still present in rcp->cpumask. During a cpu-offline, from take_cpu_down(), we queue an rt-prio idle task as the next task to be picked by the scheduler. We also call cpu_disable() which will disable any further interrupts and remove the cpu's bit from the cpu_online_map. Once the stop_machine_run() successfully calls take_cpu_down(), it calls schedule(). That's the last time a schedule is called on the offlined cpu, and hence the last time when rdp->passed_quiesc will be set to 1 through rcu_qsctr_inc(). But the cpu_quiet() on this cpu will be called only when the next RCU_SOFTIRQ occurs on this CPU. So at this time, the offlined CPU is still set in rcp->cpumask. Now coming back to the idle_task which truly offlines the CPU, it does check for a pending RCU and raises the softirq, since it will find rdp->passed_quiesc to be 0 in this case. However, since the cpu is offline I am not sure if the softirq will trigger on the CPU. Even if it doesn't, the rcu_offline_cpu() will find that rcp->completed is not the same as rcp->cur, which means that our cpu could be holding up the grace period progression. Hence we call cpu_quiet() and move ahead. But because of the window explained in the timeline, we could still have a call_rcu() before the RCU subsystem executes its CPU_DEAD notification, and we send smp_send_reschedule() to offlined cpu while trying to force the quiescent states. The appended patch adds comments and prevents checking for offlined cpu every time. cpu_online_map is updated by the _cpu_down() using stop_machine_run(). Since force_quiescent_state is invoked from irqs disabled section, stop_machine_run() won't be executing while a cpu is executing force_quiescent_state(). 
Hence the cpu_online_map is stable while we're in the irq disabled section. Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: laijs@cn.fujitsu.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russel <rusty@rustcorp.com.au> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
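
The fix described above, cpumask = (rcp->cpumask & cpu_online_map), can be illustrated with plain bitmasks. This is a minimal sketch, assuming one bit per CPU in an unsigned long; resched_targets() is a hypothetical name and the real code operates on cpumask_t.

```c
/* gp_cpumask models rcp->cpumask (CPUs the grace period still waits on);
 * online_map models cpu_online_map. A CPU offlined inside the stale
 * window is still set in gp_cpumask but already cleared in online_map. */
static unsigned long resched_targets(unsigned long gp_cpumask,
                                     unsigned long online_map)
{
    /* Only CPUs that are both awaited and online get a reschedule IPI. */
    return gp_cpumask & online_map;
}
```

For example, with CPUs 1-3 awaited (0xE) and CPU 3 just offlined (online map 0x7), only CPUs 1 and 2 (0x6) remain as targets, so no IPI is sent to the offlined CPU.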
Carlos R. Mafra	962cf36c5b	Remove argument from open_softirq which is always NULL As git-grep shows, open_softirq() is always called with the last argument being NULL block/blk-core.c: open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL); kernel/hrtimer.c: open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL); kernel/rcuclassic.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL); kernel/rcupreempt.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL); kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL); kernel/softirq.c: open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL); kernel/softirq.c: open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL); kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL); net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL); net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL); This observation has already been made by Matthew Wilcox in June 2002 (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html) "I notice that none of the current softirq routines use the data element passed to them." and the situation hasn't changed since then. So it appears we can safely remove that extra argument to save 128 (54) bytes of kernel data (text). Signed-off-by: Carlos R. Mafra <crmafra@ift.unesp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	17 years ago
Mike Travis	363ab6f142	core: use performance variant for_each_cpu_mask_nr Change references from for_each_cpu_mask to for_each_cpu_mask_nr where appropriate Reviewed-by: Paul Jackson <pj@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	17 years ago
Paul E. McKenney	8db559b830	rcu: add memory barriers and comments to rcu_check_callbacks() Add comments to the logic that infers quiescent states when interrupting from either user mode or the idle loop. Also add a memory barrier: it appears that James Huang was in fact onto something, as the scheduler is much less synchronization happy than it once was, so we can no longer rely on its memory barriers in all cases. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reported-by: James Huang <jamesclhuang@yahoo.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	17 years ago
Paul E. McKenney	e260be673a	Preempt-RCU: implementation This patch implements a new version of RCU which allows its read-side critical sections to be preempted. It uses a set of counter pairs to keep track of the read-side critical sections and flips them when all tasks exit their read-side critical sections. The details of this implementation can be found in this paper - http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf and the article - http://lwn.net/Articles/253651/ This patch was developed as a part of the -rt kernel development and is meant to provide better latencies when read-side critical sections of RCU don't disable preemption. As a consequence of keeping track of RCU readers, the readers have a slight overhead (optimizations in the paper). This implementation co-exists with the "classic" RCU implementations and can be switched to at compile time. Also includes RCU tracing summarized in debugfs. [ akpm@linux-foundation.org: build fixes on non-preempt architectures ] Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	e0ecfa7917	Preempt-RCU: fix rcu_barrier for preemptive environment. Fix rcu_barrier() to work properly in preemptive kernel environment. Also, the ordering of callback must be preserved while moving callbacks to another CPU during CPU hotplug. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Paul E. McKenney	01c1c660f4	Preempt-RCU: reorganize RCU code into rcuclassic.c and rcupdate.c This patch re-organizes the RCU code to enable multiple implementations of RCU. Users of RCU continue to include rcupdate.h and the RCU interfaces remain the same. This is in preparation for subsequently merging the preemptible RCU implementation. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Dipankar Sarma	c2d727aa2f	Preempt-RCU: Use softirq instead of tasklets for This patch makes RCU use softirq instead of tasklets. It also adds a memory barrier after raising the softirq in order to ensure that the cpu sees the most recently updated value of rcu->cur while processing callbacks. The discussion of the related theoretical race pointed out by James Huang can be found here --> http://lkml.org/lkml/2007/11/20/603 Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	17 years ago
Randy Dunlap	00e10776ff	rcu: fix section mismatch rcu_online_cpu() should be __cpuinit instead of __devinit. WARNING: vmlinux.o(.text+0x4b6d5): Section mismatch: reference to .init.text: (between 'rcu_cpu_notify' and 'wakeme_after_rcu') Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	17 years ago
Jesper Juhl	a9022e9cb9	Clean up duplicate includes in kernel/ This patch cleans up duplicate includes in kernel/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	18 years ago
Peter Zijlstra	851a67b825	lockdep: annotate rcu_read_{,un}lock{,_bh} lockdep annotate rcu_read_{,un}lock{,_bh} in order to catch imbalanced usage. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	18 years ago
Rafael J. Wysocki	8bb7844286	Add suspend-related notifications for CPU hotplug Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	18 years ago
Eric Dumazet	1c69d921ed	[PATCH] rcu: add a prefetch() in rcu_do_batch() On some workloads (for example, when a lot of close() syscalls are done), RCU qlen can be quite large, and RCU heads are no longer in cpu cache when rcu_do_batch() is called. This patch adds a prefetch() in rcu_do_batch() to give CPU a hint to bring back cache lines containing 'struct rcu_head's. Most list manipulation macros include prefetch(), but not open coded ones (at least with current C compilers :) ) I got a nice speedup on a trivial benchmark (3.48 us per iteration instead of 3.95 us on a 1.6 GHz Pentium-M) while (1) { pipe(fd); close(fd[0]); close(fd[1]);} Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	18 years ago
Oleg Nesterov	20e9751bd9	[PATCH] rcu: simplify/improve batch tuning Kill a hard-to-calculate 'rsinterval' boot parameter and per-cpu rcu_data.last_rs_qlen. Instead, add a flag rcu_ctrlblk.signaled, which records the fact that one of the CPUs has sent a resched IPI since the last rcu_start_batch(). Roughly speaking, we need two rcu_start_batch()s in order to move callbacks from ->nxtlist to ->donelist. This means that when ->qlen exceeds qhimark and continues to grow, we should send a resched IPI, and then do it again after we have gone through a quiescent state. On the other hand, if it was already sent, we don't need to do it again when another CPU detects overflow of the queue. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Oleg Nesterov	dd9daa221e	[PATCH] rcu_do_batch: make ->qlen decrement irq safe rcu_do_batch() decrements rdp->qlen with irqs enabled. This is not good, because ->qlen can also be modified by call_rcu() from an interrupt. Decrement ->qlen once with irqs disabled, after the main loop. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Chandra Seetharaman	8c78f3075d	[PATCH] cpu hotplug: replace __devinit* with __cpuinit* for cpu notifications Few of the callback functions and notifier blocks that are associated with cpu notifications incorrectly have __devinit and __devinitdata. They should be __cpuinit and __cpuinitdata instead. It makes no functional difference but wastes text area when CONFIG_HOTPLUG is enabled and CONFIG_HOTPLUG_CPU is not. This patch fixes all those instances. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Ingo Molnar	e4d9191885	[PATCH] lockdep: locking init debugging improvement Locking init improvement: - introduce and use __SPIN_LOCK_UNLOCKED for array initializations, to pass in the name string of locks, used by debugging Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Chandra Seetharaman	054cc8a2d8	[PATCH] cpu hotplug: revert initdata patch submitted for 2.6.17 This patch reverts notifier_block changes made in 2.6.17 Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Chandra Seetharaman	9c7b216d23	[PATCH] cpu hotplug: revert init patch submitted for 2.6.17 In 2.6.17, there was a problem with cpu_notifiers and XFS. I provided a band-aid solution to solve that problem. In the process, I undid all the changes you both were making to ensure that these notifiers were available only at init time (unless CONFIG_HOTPLUG_CPU is defined). We deferred the real fix to 2.6.18. Here is a set of patches that fixes the XFS problem cleanly and makes the cpu notifiers available only at init time (unless CONFIG_HOTPLUG_CPU is defined). If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run time. This patch reverts the notifier_call changes made in 2.6.17 Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Paul E. McKenney	c32e066057	[PATCH] rcutorture: add call_rcu_bh() operations Add operations for the call_rcu_bh() variant of RCU. Also add an rcu_batches_completed_bh() function, which is needed by rcutorture. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Paul E. McKenney	d83015b8f6	[PATCH] Make RCU API inaccessible to non-GPL Linux kernel modules Remove synchronize_kernel() (deprecated 2-APR-2005 in http://lkml.org/lkml/2005/4/3/11) and make the RCU API inaccessible to non-GPL Linux kernel modules (as was announced more than one year ago in http://lkml.org/lkml/2005/4/3/8). Tested on x86 and ppc64. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Heiko Carstens	986733e01d	[PATCH] RCU: introduce rcu_needs_cpu() interface With "Paul E. McKenney" <paulmck@us.ibm.com> Introduce rcu_needs_cpu() interface. This can be used to tell if there will be a new rcu batch on a cpu soon by looking at the curlist pointer. This can be used to avoid entering a tickless idle state in which the cpu would miss that a new batch is ready when rcu_start_batch() is called on a different cpu. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Chandra Seetharaman	83d722f7e1	[PATCH] Remove __devinit and __cpuinit from notifier_call definitions Few of the notifier_chain_register() callers use __init in the definition of notifier_call. It is incorrect as the function definition should be available after the initializations (they do not unregister them during initializations). This patch fixes all such usages to _not_ have the notifier_call __init section. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Chandra Seetharaman	649bbaa484	[PATCH] Remove __devinitdata from notifier block definitions Few of the notifier_chain_register() callers use __devinitdata in the definition of notifier_block data structure. It is incorrect as the data structure should be available after the initializations (they do not unregister them during initializations). This was leading to an oops when notifier_chain_register() call is invoked for those callback chains after initialization. This patch fixes all such usages to _not_ have the notifier_block data structure in the init data section. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Oleg Nesterov	caa9ee771d	[PATCH] rcu_process_callbacks: don't cli() while testing ->nxtlist __rcu_process_callbacks() disables interrupts to protect itself from call_rcu() which adds new entries to ->nxtlist. However we can check "->nxtlist != NULL" with interrupts enabled, we can't get "false positives" because call_rcu() can only change this condition from 0 to 1. Tested with rcutorture.ko. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Dipankar Sarma <dipankar@in.ibm.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Adrian Bunk	2178426d26	[PATCH] kernel/rcupdate.c: make two structs static This patch makes two needlessly global structs static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Ingo Molnar	9331b3157c	[PATCH] convert kernel/rcupdate.c:rcu_barrier_sema to mutex Convert kernel/rcupdate's rcu_barrier_sema to mutex. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	19 years ago
Greg Kroah-Hartman	01ca70dca5	[PATCH] add EXPORT_SYMBOL_GPL_FUTURE() to RCU subsystem As the RCU symbols are going to be changed to GPL in the near future, let's warn users that this is going to happen. Cc: Paul McKenney <paulmck@us.ibm.com> Acked-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	19 years ago

36 Commits (06164f3194e01ea4c76941ac60f541d656c8975f)