kernel_samsung_sm7125

jenna

Author	SHA1	Message	Date
Daniel Lezcano	f9ae39d04c	tick: Pass broadcast device to tick_broadcast_set_event() Pass the broadcast timer to tick_broadcast_set_event() instead of reevaluating tick_broadcast_device.evtdev. [ tglx: Massaged changelog ] Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: viresh.kumar@linaro.org Cc: jacob.jun.pan@linux.intel.com Cc: linux-arm-kernel@lists.infradead.org Cc: santosh.shilimkar@ti.com Cc: linaro-kernel@lists.linaro.org Cc: patches@linaro.org Cc: rickard.andersson@stericsson.com Cc: vincent.guittot@linaro.org Cc: linus.walleij@stericsson.com Cc: john.stultz@linaro.org Link: http://lkml.kernel.org/r/1362219013-18173-2-git-send-email-daniel.lezcano@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	12 years ago
Thomas Gleixner	b352bc1cbc	tick: Convert broadcast cpu bitmaps to cpumask_var_t Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130306111537.366394000@linutronix.de Cc: Rusty Russell <rusty@rustcorp.com.au>	12 years ago
Thomas Gleixner	af7bdbafe3	Revert "nohz: Make tick_nohz_irq_exit() irq safe" This reverts commit 351429b2e62b6545bb10c756686393f29ba268a1. The extra local_irq_save() is not longer needed as the call site now always calls with interrupts disabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linuxfoundation.org>	12 years ago
Frederic Weisbecker	e5ab012c32	nohz: Make tick_nohz_irq_exit() irq safe As it stands, irq_exit() may or may not be called with irqs disabled, depending on __ARCH_IRQ_EXIT_IRQS_DISABLED that the arch can define. It makes tick_nohz_irq_exit() unsafe. For example two interrupts can race in tick_nohz_stop_sched_tick(): the inner most one computes the expiring time on top of the timer list, then it's interrupted right before reprogramming the clock. The new interrupt enqueues a new timer list timer, it reprogram the clock to take it into account and it exits. The CPUs resumes the inner most interrupt and performs the clock reprogramming without considering the new timer list timer. This regression has been introduced by: `280f06774a` ("nohz: Separate out irq exit and idle loop dyntick logic") Let's fix it right now with the appropriate protections. A saner long term solution will be to remove __ARCH_IRQ_EXIT_IRQS_DISABLED and mandate that irq_exit() is called with interrupts disabled. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: <stable@vger.kernel.org> #v3.2+ Link: http://lkml.kernel.org/r/1361373336-11337-1-git-send-email-fweisbec@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	12 years ago
Thomas Gleixner	a6c0c943a1	ntp: Make ntp_lock raw seconds_overflow() is called from hard interrupt context even on Preempt-RT. This requires the lock to be a raw_spinlock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	12 years ago
Mark Rutland	5d1d9a29bc	clockevents: Fix generic broadcast for FEAT_C3STOP Commit 12ad100046: "clockevents: Add generic timer broadcast function" made tick_device_uses_broadcast set up the generic broadcast function for dummy devices (where !tick_device_is_functional(dev)), but neglected to set up the broadcast function for devices that stop in low power states (with the CLOCK_EVT_FEAT_C3STOP flag). When these devices enter low power states they will not have the generic broadcast function assigned, and will bring down the system when an attempt is made to broadcast to them. This patch ensures that the broadcast function is also assigned for devices which require broadcast in low power states. Reported-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> Cc: linux-arm-kernel@lists.infradead.org Cc: nico@linaro.org Cc: Marc.Zyngier@arm.com Cc: Will.Deacon@arm.com Cc: santosh.shilimkar@ti.com Cc: john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	12 years ago
Prarit Bhargava	84e345e4e2	time, Fix setting of hardware clock in NTP code At init time, if the system time is "warped" forward in warp_clock() it will differ from the hardware clock by sys_tz.tz_minuteswest. This time difference is not taken into account when ntp updates the hardware clock, and this causes the system time to jump forward by this offset every reboot. The kernel must take this offset into account when writing the system time to the hardware clock in the ntp code. This patch adds persistent_clock_is_local which indicates that an offset has been applied in warp_clock() and accounts for the "warp" before writing the hardware clock. x86 does not have this problem as rtc writes are software limited to a +/-15 minute window relative to the current rtc time. Other arches, such as powerpc, however do a full synchronization of the system time to the rtc and will see this problem. [v2]: generated against tip/timers/core Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Mark Rutland	12ad100046	clockevents: Add generic timer broadcast function Currently, the timer broadcast mechanism is defined by a function pointer on struct clock_event_device. As the fundamental mechanism for broadcast is architecture-specific, this means that clock_event_device drivers cannot be shared across multiple architectures. This patch adds an (optional) architecture-specific function for timer tick broadcast, allowing drivers which may require broadcast functionality to be shared across multiple architectures. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: linux-arm-kernel@lists.infradead.org Cc: nico@linaro.org Cc: Will.Deacon@arm.com Cc: Marc.Zyngier@arm.com Cc: john.stultz@linaro.org Link: http://lkml.kernel.org/r/1358183124-28461-3-git-send-email-mark.rutland@arm.com Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	12 years ago
Mark Rutland	12572dbb53	clockevents: Add generic timer broadcast receiver Currently the broadcast mechanism used for timers is abstracted by a function pointer on struct clock_event_device. As the fundamental mechanism for broadcast is architecture-specific, this ties each clock_event_device driver to a single architecture, even where the driver is otherwise generic. This patch adds a standard path for the receipt of timer broadcasts, so drivers and/or architecture backends need not manage redundant lists of timers for the purpose of routing broadcast timer ticks. [tglx: Made the implementation depend on the config switch as well ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: linux-arm-kernel@lists.infradead.org Cc: nico@linaro.org Cc: Will.Deacon@arm.com Cc: Marc.Zyngier@arm.com Cc: john.stultz@linaro.org Link: http://lkml.kernel.org/r/1358183124-28461-2-git-send-email-mark.rutland@arm.com Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	12 years ago
John Stultz	6f16eebe1f	timekeeping: Switch HAS_PERSISTENT_CLOCK to ALWAYS_USE_PERSISTENT_CLOCK Jason pointed out the HAS_PERSISTENT_CLOCK name isn't quite accurate for the config, as some systems may have the persistent_clock in some cases, but not always. So change the config name to the more clear ALWAYS_USE_PERSISTENT_CLOCK. Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Frederic Weisbecker	3f4724ea85	cputime: Allow dynamic switch between tick/virtual based cputime accounting Allow to dynamically switch between tick and virtual based cputime accounting. This way we can provide a kind of "on-demand" virtual based cputime accounting. In this mode, the kernel relies on the context tracking subsystem to dynamically probe on kernel boundaries. This is in preparation for being able to stop the timer tick in more places than just the idle state. Doing so will depend on CONFIG_VIRT_CPU_ACCOUNTING_GEN which makes it possible to account the cputime without the tick by hooking on kernel/user boundaries. Depending whether the tick is stopped or not, we can switch between tick and vtime based accounting anytime in order to minimize the overhead associated to user hooks. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>	12 years ago
Jacob Pan	4dbd27711c	tick: export nohz tick idle symbols for module use Allow drivers such as intel_powerclamp to use these apis for turning on/off ticks during idle. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>	12 years ago
Feng Tang	05ad717c77	timekeeping: Add CONFIG_HAS_PERSISTENT_CLOCK option Make the persistent clock check a kernel config option, so that some platform can explicitely select it, also make CONFIG_RTC_HCTOSYS and RTC_SYSTOHC depend on its non-existence, which could prevent the persistent clock and RTC code from doing similar thing twice during system's init/suspend/resume phases. If the CONFIG_HAS_PERSISTENT_CLOCK=n, then no change happens for kernel which still does the persistent clock check in timekeeping_init(). Cc: Thomas Gleixner <tglx@linutronix.de> Suggested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Feng Tang <feng.tang@intel.com> [jstultz: Added dependency for RTC_SYSTOHC as well] Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Feng Tang	31ade30692	timekeeping: Add persistent_clock_exist flag In current kernel, there are several places which need to check whether there is a persistent clock for the platform. Current check is done by calling the read_persistent_clock() and validating its return value. So one optimization is to do the check only once in timekeeping_init(), and use a flag persistent_clock_exist to record it. v2: Add a has_persistent_clock() helper function, as suggested by John. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Jason Gunthorpe	023f333a99	NTP: Add a CONFIG_RTC_SYSTOHC configuration The purpose of this option is to allow ARM/etc systems that rely on the class RTC subsystem to have the same kind of automatic NTP based synchronization that we have on PC platforms. Today ARM does not implement update_persistent_clock and makes extensive use of the class RTC system. When enabled CONFIG_RTC_SYSTOHC will provide a generic rtc_update_persistent_clock that stores the current time in the RTC and is intended complement the existing CONFIG_RTC_HCTOSYS option that loads the RTC at boot. Like with RTC_HCTOSYS the platform's update_persistent_clock is used first, if it works. Platforms with mixed class RTC and non-RTC drivers need to return ENODEV when class RTC should be used. Such an update for PPC is included in this patch. Long term, implementations of update_persistent_clock should migrate to proper class RTC drivers and use CONFIG_RTC_SYSTOHC instead. Tested on ARM kirkwood and PPC405 Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Kees Cook	1e817fb62c	time: create __getnstimeofday for WARNless calls The pstore RAM backend can get called during resume, and must be defensive against a suspended time source. Expose getnstimeofday logic that returns an error instead of a WARN. This can be detected and the timestamp can be zeroed out. Reported-by: Doug Anderson <dianders@chromium.org> Cc: John Stultz <johnstul@us.ibm.com> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Shawn Guo	c35ef95c27	clockevents: export clockevents_config_and_register for module use clockevents_config_and_register is a handy helper for clockevent drivers, some of which might support module build, so export the symbol. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Olof Johansson <olof@lixom.net>	12 years ago
Stephen Warren	7b1f62076b	time: convert arch_gettimeoffset to a pointer Currently, whenever CONFIG_ARCH_USES_GETTIMEOFFSET is enabled, each arch core provides a single implementation of arch_gettimeoffset(). In many cases, different sub-architectures, different machines, or different timer providers exist, and so the arch ends up implementing arch_gettimeoffset() as a call-through-pointer anyway. Examples are ARM, Cris, M68K, and it's arguable that the remaining architectures, M32R and Blackfin, should be doing this anyway. Modify arch_gettimeoffset so that it itself is a function pointer, which the arch initializes. This will allow later changes to move the initialization of this function into individual machine support or timer drivers. This is particularly useful for code in drivers/clocksource which should rely on an arch-independant mechanism to register their implementation of arch_gettimeoffset(). This patch also converts the Cris architecture to set arch_gettimeoffset directly to the final implementation in time_init(), because Cris already had separate time_init() functions per sub-architecture. M68K and ARM are converted to set arch_gettimeoffset to the final implementation in later patches, because they already have function pointers in place for this purpose. Cc: Russell King <linux@arm.linux.org.uk> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>	12 years ago
Marcelo Tosatti	e0b306fef9	time: export time information for KVM pvclock As suggested by John, export time data similarly to how its done by vsyscall support. This allows KVM to retrieve necessary information to implement vsyscall support in KVM guests. Acked-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	12 years ago
Frederic Weisbecker	74876a98a8	printk: Wake up klogd using irq_work klogd is woken up asynchronously from the tick in order to do it safely. However if printk is called when the tick is stopped, the reader won't be woken up until the next interrupt, which might not fire for a while. As a result, the user may miss some message. To fix this, lets implement the printk tick using a lazy irq work. This subsystem takes care of the timer tick state and can fix up accordingly. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>	12 years ago
Frederic Weisbecker	00b4295910	irq_work: Don't stop the tick with pending works Don't stop the tick if we have pending irq works on the queue, otherwise if the arch can't raise self-IPIs, we may not find an opportunity to execute the pending works for a while. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>	12 years ago
Frederic Weisbecker	33a5f6261a	nohz: Add API to check tick state We need some quick way to check if the CPU has stopped its tick. This will be useful to implement the printk tick using the irq work subsystem. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>	12 years ago
Youquan Song	69a37beabf	cpuidle: Quickly notice prediction failure for repeat mode The prediction for future is difficult and when the cpuidle governor prediction fails and govenor possibly choose the shallower C-state than it should. How to quickly notice and find the failure becomes important for power saving. cpuidle menu governor has a method to predict the repeat pattern if there are 8 C-states residency which are continuous and the same or very close, so it will predict the next C-states residency will keep same residency time. There is a real case that turbostat utility (tools/power/x86/turbostat) at kernel 3.3 or early. turbostat utility will read 10 registers one by one at Sandybridge, so it will generate 10 IPIs to wake up idle CPUs. So cpuidle menu governor will predict it is repeat mode and there is another IPI wake up idle CPU soon, so it keeps idle CPU stay at C1 state even though CPU is totally idle. However, in the turbostat, following 10 registers reading is sleep 5 seconds by default, so the idle CPU will keep at C1 for a long time though it is idle until break event occurs. In a idle Sandybridge system, run "./turbostat -v", we will notice that deep C-state dangles between "70% ~ 99%". After patched the kernel, we will notice deep C-state stays at >99.98%. In the patch, a timer is added when menu governor detects a repeat mode and choose a shallow C-state. The timer is set to a time out value that greater than predicted time, and we conclude repeat mode prediction failure if timer is triggered. When repeat mode happens as expected, the timer is not triggered and CPU waken up from C-states and it will cancel the timer initiatively. When repeat mode does not happen, the timer will be time out and menu governor will quickly notice that the repeat mode prediction fails and then re-evaluates deeper C-states possibility. Below is another case which will clearly show the patch much benefit: #include <stdlib.h> #include <stdio.h> #include <unistd.h> #include <signal.h> #include <sys/time.h> #include <time.h> #include <pthread.h> volatile int * shutdown; volatile long * count; int delay = 20; int loop = 8; void usage(void) { fprintf(stderr, "Usage: idle_predict [options]\n" " --help -h Print this help\n" " --thread -n Thread number\n" " --loop -l Loop times in shallow Cstate\n" " --delay -t Sleep time (uS)in shallow Cstate\n"); } void simple_loop() { int idle_num = 1; while (!(shutdown)) { count = count + 1; if (idle_num % loop) usleep(delay); else { /* sleep 1 second / usleep(1000000); idle_num = 0; } idle_num++; } } static void sighand(int sig) { shutdown = 1; } int main(int argc, char argv[]) { sigset_t sigset; int signum = SIGALRM; int i, c, er = 0, thread_num = 8; pthread_t pt[1024]; static char optstr[] = "n:l:t:h:"; while ((c = getopt(argc, argv, optstr)) != EOF) switch (c) { case 'n': thread_num = atoi(optarg); break; case 'l': loop = atoi(optarg); break; case 't': delay = atoi(optarg); break; case 'h': default: usage(); exit(1); } printf("thread=%d,loop=%d,delay=%d\n",thread_num,loop,delay); count = malloc(sizeof(long)); shutdown = malloc(sizeof(int)); count = 0; *shutdown = 0; sigemptyset(&sigset); sigaddset(&sigset, signum); sigprocmask (SIG_BLOCK, &sigset, NULL); signal(SIGINT, sighand); signal(SIGTERM, sighand); for(i = 0; i < thread_num ; i++) pthread_create(&pt[i], NULL, simple_loop, NULL); for (i = 0; i < thread_num; i++) pthread_join(pt[i], NULL); exit(0); } Get powertop V2 from git://github.com/fenrus75/powertop, build powertop. After build the above test application, then run it. Test plaform can be Intel Sandybridge or other recent platforms. #./idle_predict -l 10 & #./powertop We will find that deep C-state will dangle between 40%~100% and much time spent on C1 state. It is because menu governor wrongly predict that repeat mode is kept, so it will choose the C1 shallow C-state even though it has chance to sleep 1 second in deep C-state. While after patched the kernel, we find that deep C-state will keep >99.6%. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	12 years ago
John Stultz	d6ad418763	time: Kill xtime_lock, replacing it with jiffies_lock Now that timekeeping is protected by its own locks, rename the xtime_lock to jifffies_lock to better describe what it protects. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Lars-Peter Clausen	f95a985781	time/jiffies: Make clocksource_jiffies static Commit `f1b8274` ("clocksource: Cleanup clocksource selection") removed all external references to clocksource_jiffies so there is no need to have the symbol globally visible. Fixes the following sparse warning: CHECK kernel/time/jiffies.c kernel/time/jiffies.c:61:20: warning: symbol 'clocksource_jiffies' was not declared. Should it be static? Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	12 years ago
Richard Cochran	65f8f9a1c1	time: remove the timecompare code. This patch removes the timecompare code from the kernel. The top five reasons to do this are: 1. There are no more users of this code. 2. The original idea was a bit weak. 3. The original author has disappeared. 4. The code was not general purpose but tuned to a particular hardware, 5. There are better ways to accomplish clock synchronization. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Acked-by: John Stultz <john.stultz@linaro.org> Tested-by: Bob Liu <lliubbo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	12 years ago
Chuansheng Liu	b8f61116c1	tick: Correct the comments for tick_sched_timer() In the comments of function tick_sched_timer(), the sentence "timer->base->cpu_base->lock held" is not right. In function __run_hrtimer(), before call timer->function(), the cpu_base->lock has been unlocked. Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> Cc: fei.li@intel.com Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1351098455.15558.1421.camel@cliu38-desktop-build Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	12 years ago
Chuansheng Liu	351f181f91	timers, sched: Correct the comments for tick_sched_timer() In the comments of function tick_sched_timer(), the sentence "timer->base->cpu_base->lock held" is not right. In function __run_hrtimer(), before call timer->function(), the cpu_base->lock has been unlocked. Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> Cc: fei.li@intel.com Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1351098455.15558.1421.camel@cliu38-desktop-build Signed-off-by: Ingo Molnar <mingo@kernel.org>	12 years ago
Frederic Weisbecker	94a5714020	tick: Conditionally build nohz specific code in tick handler This optimize a bit the high res tick sched handler. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org>	13 years ago
Frederic Weisbecker	9e8f559b08	tick: Consolidate tick handling for high and low res handlers Besides unifying code, this also adds the idle check before processing idle accounting specifics on the low res handler. This way we also generalize this part of the nohz code for !CONFIG_HIGH_RES_TIMERS to prepare for the adaptive tickless features. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org>	13 years ago
Frederic Weisbecker	5bb962269c	tick: Consolidate timekeeping handling code Unify the duplicated timekeeping handling code of low and high res tick sched handlers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org>	13 years ago
Dan Carpenter	5b3900cd40	timekeeping: Cast raw_interval to u64 to avoid shift overflow We fixed a bunch of integer overflows in timekeeping code during the 3.6 cycle. I did an audit based on that and found this potential overflow. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: John Stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20121009071823.GA19159@elgon.mountain Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org	13 years ago
Frederic Weisbecker	2b17c545a4	nohz: Fix one jiffy count too far in idle cputime When we stop the tick in idle, we save the current jiffies value in ts->idle_jiffies. This snapshot is substracted from the later value of jiffies when the tick is restarted and the resulting delta is accounted as idle cputime. This is how we handle the idle cputime accounting without the tick. But sometimes we need to schedule the next tick to some time in the future instead of completely stopping it. In this case, a tick may happen before we restart the periodic behaviour and from that tick we account one jiffy to idle cputime as usual but we also increment the ts->idle_jiffies snapshot by one so that when we compute the delta to account, we substract the one jiffy we just accounted. To prepare for stopping the tick outside idle, we introduced a check that prevents from fixing up that ts->idle_jiffies if we are not running the idle task. But we use idle_cpu() for that and this is a problem if we run the tick while another CPU remotely enqueues a ttwu to our runqueue: CPU 0: CPU 1: tick_sched_timer() { ttwu_queue_remote() if (idle_cpu(CPU 0)) ts->idle_jiffies++; } Here, idle_cpu() notes that &rq->wake_list is not empty and hence won't consider the CPU as idle. As a result, ts->idle_jiffies won't be incremented. But this is wrong because we actually account the current jiffy to idle cputime. And that jiffy won't get substracted from the nohz time delta. So in the end, this jiffy is accounted twice. Fix this by changing idle_cpu(smp_processor_id()) with is_idle_task(current). This way the jiffy is substracted correctly even if a ttwu operation is enqueued on the CPU. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> # 3.5+ Link: http://lkml.kernel.org/r/1349308004-3482-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	13 years ago
John Stultz	92bb1fcf57	time: Only do nanosecond rounding on GENERIC_TIME_VSYSCALL_OLD systems We only do rounding to the next nanosecond so we don't see minor 1ns inconsistencies in the vsyscall implementations. Since we're changing the vsyscall implementations to avoid this, conditionalize the rounding only to the GENERIC_TIME_VSYSCALL_OLD architectures. Cc: Tony Luck <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
John Stultz	576094b7f0	time: Introduce new GENERIC_TIME_VSYSCALL Now that we moved everyone over to GENERIC_TIME_VSYSCALL_OLD, introduce the new declaration and config option for the new update_vsyscall method. Cc: Tony Luck <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
John Stultz	7063942116	time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD To help migrate archtectures over to the new update_vsyscall method, redfine CONFIG_GENERIC_TIME_VSYSCALL as CONFIG_GENERIC_TIME_VSYSCALL_OLD Cc: Tony Luck <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
John Stultz	d7b4202e05	time: Move timekeeper structure to timekeeper_internal.h for vsyscall changes We're going to need to access the timekeeper in update_vsyscall, so make the structure available for those who need it. Cc: Tony Luck <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
John Stultz	b3c869d35b	jiffies: Remove compile time assumptions about CLOCK_TICK_RATE CLOCK_TICK_RATE is used to accurately caclulate exactly how a tick will be at a given HZ. This is useful, because while we'd expect NSEC_PER_SEC/HZ, the underlying hardware will have some granularity limit, so we won't be able to have exactly HZ ticks per second. This slight error can cause timekeeping quality problems when using the jiffies or other jiffies driven clocksources. Thus we currently use compile time CLOCK_TICK_RATE value to generate SHIFTED_HZ and NSEC_PER_JIFFIES, which we then use to adjust the jiffies clocksource to correct this error. Unfortunately though, since CLOCK_TICK_RATE is a compile time value, and the jiffies clocksource is registered very early during boot, there are a number of cases where there are different possible hardware timers that have different tick rates. This causes problems in cases like ARM where there are numerous different types of hardware, each having their own compile-time CLOCK_TICK_RATE, making it hard to accurately support different hardware with a single kernel. For the most part, this doesn't matter all that much, as not too many systems actually utilize the jiffies or jiffies driven clocksource. Usually there are other highres clocksources who's granularity error is negligable. Even so, we have some complicated calcualtions that we do everywhere to handle these edge cases. This patch removes the compile time SHIFTED_HZ value, and introduces a register_refined_jiffies() function. This results in the default jiffies clock as being assumed a perfect HZ freq, and allows archtectures that care about jiffies accuracy to call register_refined_jiffies() with the tick rate, specified dynamically at boot. This allows us, where necessary, to not have a compile time CLOCK_TICK_RATE constant, simplifies the jiffies code, and still provides a way to have an accurate jiffies clock. NOTE: Since this patch does not add register_refinied_jiffies() calls for every arch, it may cause time quality regressions in some cases. Its likely these will not be noticable, but if they are an issue, adding the following to the end of setup_arch() should resolve the regression: register_refinied_jiffies(CLOCK_TICK_RATE) Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
John Stultz	a65bcc12ad	alarmtimer: Rename alarmtimer_remove to alarmtimer_dequeue Now that alarmtimer_remove has been simplified, change its name to _dequeue to better match its paired _enqueue function. Cc: Arve Hjønnevåg <arve@android.com> Cc: Colin Cross <ccross@android.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
John Stultz	dae373be9f	alarmtimer: Use hrtimer per-alarm instead of per-base Arve Hjønnevåg reported numerous crashes from the "BUG_ON(timer->state != HRTIMER_STATE_CALLBACK)" check in __run_hrtimer after it called alarmtimer_fired. It ends up the alarmtimer code was not properly handling possible failures of hrtimer_try_to_cancel, and because these faulres occur when the underlying base hrtimer is being run, this limits the ability to properly handle modifications to any alarmtimers on that base. Because much of the logic duplicates the hrtimer logic, it seems that we might as well have a per-alarmtimer hrtimer, and avoid the extra complextity of trying to multiplex many alarmtimers off of one hrtimer. Thus this patch moves the hrtimer to the alarm structure and simplifies the management logic. Changelog: v2: * Includes a fix for double alarm_start calls found by Arve Cc: Arve Hjønnevåg <arve@android.com> Cc: Colin Cross <ccross@android.com> Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: Arve Hjønnevåg <arve@android.com> Tested-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
Todd Poynor	59a93c27c4	alarmtimer: Implement minimum alarm interval for allowing suspend alarmtimer suspend return -EBUSY if the next alarm will fire in less than 2 seconds. This allows one RTC seconds tick to occur subsequent to this check before the alarm wakeup time is set, ensuring the wakeup time is still in the future (assuming the RTC does not tick one more second prior to setting the alarm). If suspend is rejected due to an imminent alarm, hold a wakeup source for 2 seconds to process the alarm prior to reattempting suspend. If setting the alarm incurs an -ETIME for an alarm set in the past, or any other problem setting the alarm, abort suspend and hold a wakelock for 1 second while the alarm is allowed to be serviced or other hopefully transient conditions preventing the alarm clear up. Signed-off-by: Todd Poynor <toddpoynor@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	13 years ago
Paul E. McKenney	803b0ebae9	time: RCU permitted to stop idle entry via softirq The can_stop_idle_tick() function complains if a softirq vector is raised too late in the idle-entry process, presumably in order to prevent dangling softirq invocations from being delayed across the full idle period, which might be indefinitely long -- and if softirq was asserted any later than the call to this function, such a delay might well happen. However, RCU needs to be able to use softirq to stop idle entry in order to be able to drain RCU callbacks from the current CPU, which in turn enables faster entry into dyntick-idle mode, which in turn reduces power consumption. Because RCU takes this action at a well-defined point in the idle-entry path, it is safe for RCU to take this approach. This commit therefore silences the error message that is sometimes produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ to process. The error message will continue to be issued for other softirq vectors. Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	13 years ago
John Stultz	ec145babe7	time: Fix timeekeping_get_ns overflow on 32bit systems Daniel Lezcano reported seeing multi-second stalls from keyboard input on his T61 laptop when NOHZ and CPU_IDLE were enabled on a 32bit kernel. He bisected the problem down to commit `1e75fa8be9` ("time: Condense timekeeper.xtime into xtime_sec"). After reproducing this issue, I narrowed the problem down to the fact that timekeeping_get_ns() returns a 64bit nsec value that hasn't been accumulated. In some cases this value was being then stored in timespec.tv_nsec (which is a long). On 32bit systems, with idle times larger then 4 seconds (or less, depending on the value of xtime_nsec), the returned nsec value would overflow 32bits. This limited kept time from increasing, causing timers to not expire. The fix is to make sure we don't directly store the result of timekeeping_get_ns() into a tv_nsec field, instead using a 64bit nsec value which can then be added into the timespec via timespec_add_ns(). Reported-and-bisected-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Link: http://lkml.kernel.org/r/1347405963-35715-1-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	13 years ago
Alex Shi	c1cc017c59	sched/nohz: Clean up select_nohz_load_balancer() There is no load_balancer to be selected now. It just sets the state of the nohz tick to stop. So rename the function, pass the 'cpu' as a parameter and then remove the useless call from tick_nohz_restart_sched_tick(). [ s/set_nohz_tick_stopped/nohz_balance_enter_idle/g s/clear_nohz_tick_stopped/nohz_balance_exit_idle/g ] Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venki@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1347261059-24747-1-git-send-email-alex.shi@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	13 years ago
Charles Wang	749c8814f0	sched: Add missing call to calc_load_exit_idle() Azat Khuzhin reported high loadavg in Linux v3.6 After checking the upstream scheduler code, I found Peter's commit: `5167e8d541` sched/nohz: Rewrite and fix load-avg computation -- again not fully applied, missing the call to calc_load_exit_idle(). After that idle exit in sampling window will always be calculated to non-idle, and the load will be higher than normal. This patch adds the missing call to calc_load_exit_idle(). Signed-off-by: Charles Wang <muming.wq@taobao.com> Cc: stable@kernel.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1345449754-27130-1-git-send-email-muming.wq@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	13 years ago
Rafael J. Wysocki	adc78e6b99	timekeeping: Add suspend and resume of clock event devices Some clock event devices, for example such that belong to PM domains, need to be handled in a spcial way during the timekeeping suspend and resume (which takes place in the system core, or "syscore", stages of system power transitions) in analogy with clock sources. Introduce .suspend() and .resume() callbacks for clock event devices that will be executed by timekeeping_suspend/_resume(), respectively, next the the clock sources' .suspend() and .resume() callbacks. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	13 years ago
John Stultz	cee58483cf	time: Move ktime_t overflow checking into timespec_valid_strict Andreas Bombe reported that the added ktime_t overflow checking added to timespec_valid in commit `4e8b14526c` ("time: Improve sanity checking of timekeeping inputs") was causing problems with X.org because it caused timeouts larger then KTIME_T to be invalid. Previously, these large timeouts would be clamped to KTIME_MAX and would never expire, which is valid. This patch splits the ktime_t overflow checking into a new timespec_valid_strict function, and converts the timekeeping codes internal checking to use this more strict function. Reported-and-tested-by: Andreas Bombe <aeb@debian.org> Cc: Zhouping Liu <zliu@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	13 years ago
John Stultz	bf2ac31219	time: Avoid making adjustments if we haven't accumulated anything If update_wall_time() is called and the current offset isn't large enough to accumulate, avoid re-calling timekeeping_adjust which may change the clock freq and can cause 1ns inconsistencies with CLOCK_REALTIME_COARSE/CLOCK_MONOTONIC_COARSE. Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1345595449-34965-5-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	13 years ago
John Stultz	6ea565a9be	time: Avoid potential shift overflow with large shift values Andreas Schwab noticed that the 1 << tk->shift could overflow if the shift value was greater than 30, since 1 would be a 32bit long on 32bit architectures. This issue was introduced by `1e75fa8be` (time: Condense timekeeper.xtime into xtime_sec) Use 1ULL instead to ensure we don't overflow on the shift. Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1345595449-34965-4-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	13 years ago
Andreas Schwab	85dc8f05c9	time: Fix casting issue in timekeeping_forward_now arch_gettimeoffset returns a u32 value which when shifted by tk->shift can overflow. This issue was introduced with `1e75fa8be` (time: Condense timekeeper.xtime into xtime_sec) Cast it to u64 first. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1345595449-34965-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	13 years ago

1 2 3 4 5 ...

797 Commits (c0e811d9f5d1ee708f06c4f0a1009f8a1d22f364)