If CONFIG_FTRACE is selected and /proc/sys/kernel/ftrace_enabled is
set to a non-zero value the ftrace routine will be called everytime
we enter a kernel function that is not marked with the "notrace"
attribute.
The ftrace routine will then call a registered function if a function
happens to be registered.
[ This code has been highly hacked by Steven Rostedt and Ingo Molnar,
so don't blame Arnaldo for all of this ;-) ]
Update:
It is now possible to register more than one ftrace function.
If only one ftrace function is registered, that will be the
function that ftrace calls directly. If more than one function
is registered, then ftrace will call a function that will loop
through the functions to call.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
notrace signals that a function should not be traced. Most of the
time this is used by tracers to annotate code that cannot be
traced - it's in a volatile state (such as in user vdso context
or NMI context) or it's in the tracer internals.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The tracer may need to call preempt_enable and disable functions
for time keeping and such. The trace gets ugly when we see these
functions show up for all traces. To make the output cleaner
this patch adds preempt_enable_notrace and preempt_disable_notrace
to be used by tracer (and debugging) functions.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The tracer wants to be able to convert the state number
into a user visible character. This patch pulls that conversion
string out the scheduler into the header. This way if it were to
ever change, other parts of the kernel will know.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
add 3 lightweight callbacks to the tracer backend.
zero impact if tracing is turned off.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If previous window was above representable values of u16,
strange things will happen if undo with the truncated value
is called for. Alternatively, this could be fixed by some
max trickery but that would limit undoing high-speed undos.
Adds 16-bit hole but there isn't anything to fill it with.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Steuck <greg@nest.cx> points out that some of the netfilter
headers can't be used in userspace without including linux/types.h
first. The headers include their own linux/types.h include statements,
these are stripped by make headers-install because they are inside
#ifdef __KERNEL__ however. Move them out to fix this.
Reported and Tested by Greg Steuck.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Propagate feature bits from the NETDEV_FEAT_CHANGE notifier. For now
only TSO is propagated for devices that announce their ability to
support TSO in combination with VLAN accel by setting the NETIF_F_VLAN_TSO
flag.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to have the drvdata field set properly when creating the device
as sysfs callbacks can assume it is present and it can race the later
setting of this field.
So, create two new functions, deviec_create_vargs() and
device_create_drvdata() that take this new field.
device_create_drvdata() will go away in 2.6.27 as the drvdata field will
just be moved to the device_create() call as it should be.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Since 2.6.25 the HID_QUIRK_APPLE_HAS_FN quirk is enabled even for
non-laptop Apple keyboards of the Aluminium series. The USB version of
these don't need Numlock emulation, like the laptop (and Aluminium
Wireless) do, as they have a proper keypad.
This patch splits the Numlock emulation for Apple keyboards in a
different quirk flag, so that it can be enabled for all the keyboards
but the Aluminium USB ones.
If the Numlock emulation is enabled for Aluminium USB keyboards, the
JKL and UIO keys become the numeric pad, and the rest of the keyboard
is disabled, included the key used to disable Numlock.
Additionally, these keyboard should not have a Numlock at all, as the
Numlock key is instead replaced by the 'Clear' key as usual for Apple
USB keyboards.
Signed-off-by: Diego 'Flameeyes' Petteno <flameeyes@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
I was hoping ATA_HORKAGE_NODMA | ATA_HORKAGE_SKIP_PM could keep it
happy but no even this doesn't work under certain configurations and
it's not like we can do anything useful with the cofig device anyway.
Replace ATA_HORKAGE_SKIP_PM with ATA_HORKAGE_DISABLE and use it for
the config device. This makes the device completely ignored by
libata.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This timeout was set low because previously PMP register access was
done via polling and register access timeouts could stack up. This is
no longer the case. One timeout will make all following accesses fail
immediately.
In rare cases both marvell and SIMG PMPs need almost a second. Bump
it to 3s.
While at it, rename it to SATA_PMP_RW_TIMEOUT. It's not specific to
SCR access.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Use the kernel-provided clamp_val() macro.
FIT was always applied to a member of struct ata_timing (unsigned short)
and two constants. clamp_val will not cast to short anymore.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Given that <linux/in6.h> contains a __KERNEL__ test, it should be
unifdef-ed.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Given that <linux/dlm_plock.h> contains a conditional __KERNEL__ test,
it should be moved from header-y to unifdef-y.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David Teigland <teigland@redhat.com>
The svc_rdma_send_error function is called when an RPCRDMA protocol
error is detected. This function attempts to post an error reply message.
Since an error posting to a transport in error is ignored, change
the return type to void.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Some providers may wait while destroying adapter resources.
Since it is possible that the last reference is put on the
dto_tasklet, the actual destroy must be scheduled as a work item.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Replace the one-off linked list implementation used to implement the
context cache with the standard Linux list_head lists. Add a context
counter to catch resource leaks. A WARN_ON will be added later to
ensure that we've freed all contexts.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
An NFS_WRITE requires a set of RDMA_READ requests to fetch the write
data from the client. There are two principal pieces of data that
need to be tracked: the list of pages that comprise the completed RPC
and the SGE of dma mapped pages to refer to this list of pages. Previously
this whole bit was managed as a linked list of contexts with the
context containing the page list buried in this list. This patch
simplifies this processing by not keeping a linked list, but rather only
a pionter from the last submitted RDMA_READ's context to the context
that maps the set of pages that describe the RPC. This significantly
simplifies this code path. SGE contexts are cleaned up inline in the DTO
path instead of at read completion time.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Remove the old driver_name/type scheme for i2c driver matching. Only the
standard aliasing model will be used from now on.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Even though copy_compat_strings() doesn't cache the pages,
copy_strings_kernel() and stuff indirectly called by e.g.
->load_binary() is doing that, so we need to drop the
cache contents in the end.
[found by WANG Cong <wangcong@zeuux.org>]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Everybody wants to pass it a function pointer, and in fact, that is what
you _must_ pass it for it to make sense (since it knows that ia64 and
ppc64 use descriptors for function pointers and fetches the actual
address from there).
So don't make the argument be a 'unsigned long' and force everybody to
add a cast.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The atm_tcp.h uses types from linux/atm.h, but does not include it.
It should also use the standard __u## types from linux/types.h rather
than the uint##_t types since the former can be found with the kernel
already.
Same goes for linux/atm.h. The linux/socket.h include there also gets
dropped as atm.h does not actually use anything from socket.h.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current module loader lookups ".data.percpu" ELF section to perform
per_cpu relocation. But DEFINE_PER_CPU_SHARED_ALIGNED() uses another
section (".data.percpu.shared_aligned"), currently only handled in
vmlinux.lds, not by module loader.
To correct this problem, instead of adding logic into module loader, or
using at build time a module.lds file for all arches to group
".data.percpu.shared_aligned" into ".data.percpu", just use ".data.percpu"
for modules.
Alignment requirements are correctly handled by ld and module loader.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a common hex array in hexdump.c so everyone can use it.
Add a common hi/lo helper to avoid the shifting masking that is
done to get the upper and lower nibbles of a byte value.
Pull the pack_hex_byte helper from kgdb as it is opencoded many
places in the tree that will be consolidated.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
match_strcpy() is a somewhat creepy function: the caller needs to make sure
that the destination buffer is big enough, and when he screws up or
forgets, match_strcpy() happily overruns the buffer.
There's exactly one customer: v9fs_parse_options(). I believe it currently
can't overflow its buffer, but that's not exactly obvious.
The source string is a substing of the mount options. The kernel silently
truncates those to PAGE_SIZE bytes, including the terminating zero. See
compat_sys_mount() and do_mount().
The destination buffer is obtained from __getname(), which allocates from
name_cachep, which is initialized by vfs_caches_init() for size PATH_MAX.
We're safe as long as PATH_MAX <= PAGE_SIZE. PATH_MAX is 4096. As far as
I know, the smallest PAGE_SIZE is also 4096.
Here's a patch that makes the code a bit more obviously correct. It
doesn't depend on PATH_MAX <= PAGE_SIZE.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Jim Meyering <meyering@redhat.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Disable Virtual DMA support for now (it causes system hangs).
Thanks to TAKADA Yoshihito for the help with debugging the problem.
Reported-by: TAKADA Yoshihito <takada@mbf.nifty.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
SELECT_MASK() can now become static.
[bart: remove space between function name and open parenthesis]
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
because of the class_device was removed, now do the children list removing
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some devices, like md, may create partitions only at first access,
so allow root= to be set to a valid non-existant partition of an
existing disk. This applies only to non-initramfs root mounting.
This fixes a regression from 2.6.24 which did allow this to happen and
broke some users machines :(
Acked-by: Neil Brown <neilb@suse.de>
Tested-by: Joao Luis Meloni Assirati <assirati@nonada.if.usp.br>
Cc: stable <stable@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This will be used by the wireless usb code, as well as potentially other
USB code.
Originally based on some .c code written by Inaky Perez-Gonzalez
<inaky.perez-gonzalez@intel.com>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Prior to 2.6.26 fuse only supported single page write requests. In theory all
fuse filesystem should be able support bigger than 4k writes, as there's
nothing in the API to prevent it. Unfortunately there's a known case in
NTFS-3G where big writes cause filesystem corruption. There could also be
other filesystems, where the lack of testing with big write requests would
result in bugs.
To prevent such problems on a kernel upgrade, disable big writes by default,
but let filesystems set a flag to turn it on.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Szabolcs Szakacsits <szaka@ntfs-3g.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When mm destruction happens, we should pass mm_update_next_owner() the old mm.
But unfortunately new mm is passed in exec_mmap().
Thus, kernel panic is possible when a multi-threaded process uses exec().
Also, the owner member comment description is wrong. mm->owner does not
necessarily point to the thread group leader.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: "Paul Menage" <menage@google.com>
Cc: "KAMEZAWA Hiroyuki" <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They aren't used. They were briefly used as part of some other patches to
provide an alternative format for displaying some /proc and /sys cpumasks.
They probably should have been removed when those other patches were dropped,
in favor of a different solution.
Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: "Mike Travis" <travis@sgi.com>
Cc: "Bert Wesarg" <bert.wesarg@googlemail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds needed_headroom/needed_tailroom members to struct
net_device and updates many places that allocate sbks to use them. Not
all of them can be converted though, and I'm sure I missed some (I
mostly grepped for LL_RESERVED_SPACE)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wireless networking, particularly with MESH enabled, has
quite strong requirements for link-layer header space.
Based upon some numbers and descriptions from Johannes Berg
we use 96 (same as AX25) for plain wireless, and with
mesh enabled we use 128.
In the process, simplify the cpp conditional logic here by
ordering the cases by those needing the most space down
to those needing the least case.
Signed-off-by: David S. Miller <davem@davemloft.net>
Because it's not correct with a non-preemptable BKL and just causes
PREEMPT kernels to have longer latencies than non-PREEMPT ones (which is
obviously not the point of it at all).
Of course, that config option actually got removed as an option earlier,
so for now this basically disables it entirely, but if BKL preemption is
ever resurrected it will be a meaningful optimization. And in the
meantime, it at least documents the intent of the code, while not doing
the wrong thing.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We should not go through the task pointer to get at the thread info,
since it's usually cheaper to just access the thread info directly.
So don't make the code look up 'current', when we can just use the
thread info accessor functions directly. This generally avoids one
level of indirection and tends to work better together with code that
also looks at other thread flags (eg preempt_count).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The syncppp layer wants a mid-level netdev private pointer.
It was using netdev->priv but that only worked by accident,
and thus this scheme was broken when the device private
allocation strategy changed.
Add a proper mid-layer private pointer for uses like this,
update syncppp and all users, and remove the HDLC_PPP broken
tag from drivers/net/wan/Kconfig
Signed-off-by: David S. Miller <davem@davemloft.net>
It acts exactly like a regular 'cond_resched()', but will not get
optimized away when CONFIG_PREEMPT is set.
Normal kernel code is already preemptable in the presense of
CONFIG_PREEMPT, so cond_resched() is optimized away (see commit
02b67cc3ba "sched: do not do
cond_resched() when CONFIG_PREEMPT").
But when wanting to conditionally reschedule while holding a lock, you
need to use "cond_sched_lock(lock)", and the new function is the BKL
equivalent of that.
Also make fs/locks.c use it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the old driver_name/type matching scheme is going away soon, change
the dummy device mechanism to use the new matching scheme.
This has the downside that dummy i2c clients can no longer choose
their name, they'll all appear as "dummy" in sysfs and in log
messages. I don't think it is a problem in practice though, as there
is little reason to use these i2c clients to log messages.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The generic semaphore rewrite had a huge performance regression on AIM7
(and potentially other BKL-heavy benchmarks) because the generic
semaphores had been rewritten to be simple to understand and fair. The
latter, in particular, turns a semaphore-based BKL implementation into a
mess of scheduling.
The attempt to fix the performance regression failed miserably (see the
previous commit 00b41ec261 'Revert
"semaphore: fix"'), and so for now the simple and sane approach is to
instead just go back to the old spinlock-based BKL implementation that
never had any issues like this.
This patch also has the advantage of being reported to fix the
regression completely according to Yanmin Zhang, unlike the semaphore
hack which still left a couple percentage point regression.
As a spinlock, the BKL obviously has the potential to be a latency
issue, but it's not really any different from any other spinlock in that
respect. We do want to get rid of the BKL asap, but that has been the
plan for several years.
These days, the biggest users are in the tty layer (open/release in
particular) and Alan holds out some hope:
"tty release is probably a few months away from getting cured - I'm
afraid it will almost certainly be the very last user of the BKL in
tty to get fixed as it depends on everything else being sanely locked."
so while we're not there yet, we do have a plan of action.
Tested-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Alexander Viro <viro@ftp.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It actually makes much more sense there, and we do tend to need it for
non-RCU usage too. Moving it to <linux/compiler.h> will allow some
other cases that have open-coded the same logic to use the same helper
function that RCU has used.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>