Currently, although find_last_bit is EXPORTed, it is statically linked
with the kernel and is referenced only under CONFIG_SMP.
When CONFIG_SMP is undefined and find_last_bit is referenced only by
modules, linking fails with:
ERROR: "find_last_bit" [fs/nfs/nfs.ko] undefined!
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Always omit frame pointers on s390. They aren't too useful for the
kernel since we have already the kernel stack backchain which allows
us to walk the kernel stack.
So eleminate the extra code for frame pointers. Only allow the extra
code for the function tracer since the gcc compile options -pg and
-fomit-frame-pointer are incompatible.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Impact: fix not-so-critical but annoying bug
sg_miter_next() returns 0 sized mapping if there is an zero sized sg
entry in the list or at the end of each iteration. As the users
always check the ->length field, this bug shouldn't be critical other
than causing unnecessary iteration.
Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
notice one system /proc/iomem some entries missed the name for pci_devices
it turns that dev->dev.kobj name is changed after device_add.
for pci code: via acpi_pci_root_driver.ops.add (aka acpi_pci_root_add)
==> pci_acpi_scan_root is used to scan pci bus/device, and at the same
time we read the resource for pci_dev in the pci_read_bases, we have
res->name = pci_name(pci_dev); pci_name is calling dev_name.
later via acpi_pci_root_driver.ops.start (aka acpi_pci_root_start) ==>
pci_bus_add_device to add all pci_dev in kobj tree. pci_bus_add_device
will call device_add.
actually in device_add
/* first, register with generic layer. */
error = kobject_add(&dev->kobj, dev->kobj.parent, "%s", dev_name(dev));
if (error)
goto Error;
will get one new name for that kobj, old name is freed.
[Impact: fix corrupted names in /proc/iomem ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Users can read sysfs files, there is no reason they should not be
allowed to listen to uevents. This lets xorg and other userspace
programs properly get these messages without having to be root.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit f520360d93.
Tetsuo Handa, running a kernel with CONFIG_DEBUG_PAGEALLOC=y and
CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug, has been hitting RCU detected
CPU stalls: it's been spinning in the loop where do_execve() counts up
the args (but why wasn't fixup_exception working? dunno).
The recent change, switching kobject_uevent_env() from UMH_WAIT_EXEC
to UMH_NO_WAIT, is broken: the exec uses args on the local stack here,
and an env which is kfreed as soon as call_usermodehelper() returns.
It very much needs to wait for the exec to be done.
An alternative would be to keep the UMH_NO_WAIT, and complicate the code
to allocate and free these resources correctly? but no, as GregKH
pointed out when making the commit, CONFIG_UEVENT_HELPER_PATH="" is a
much better optimization - though some distros are still saying
/sbin/hotplug in their .config, yet with no such binary in their initrd
or their root.
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Will Newton <will.newton@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: provide useful missing info for developers
Kernel taint can occur in several situations such as warnings,
load of prorietary or staging modules, bad page, etc...
But when such taint happens, a developer might still be working on
the kernel, expecting that lockdep is still enabled. But a taint
disables lockdep without ever warning about it.
Such a kernel behaviour doesn't really help for kernel development.
This patch adds this missing warning.
Since the taint is done most of the time after the main message that
explain the real source issue, it seems safe to warn about it inside
add_taint() so that it appears at last, without hurting the main
information.
v2: Use a generic helper to disable lockdep instead of an
open coded xchg().
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1239412638-6739-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't offer a default-y option when the user has turned off
CONFIG_DETECT_SOFTLOCKUP already.
Do offer it as 'y' only if DETECT_SOFTLOCKUP is on already.
This makes it match previous behavior - where the hung-task check was
embedded i CONFIG_DETECT_SOFTLOCKUP code.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
lib/decompress_unlzma.c depends on slab.h without including it:
CC lib/decompress_unlzma.o
lib/decompress_unlzma.c: In function ‘rc_free’:
lib/decompress_unlzma.c:122: error: implicit declaration of function ‘kfree’
lib/decompress_unlzma.c: In function ‘unlzma’:
lib/decompress_unlzma.c:551: error: implicit declaration of function ‘kmalloc’
lib/decompress_unlzma.c:551: warning: assignment makes pointer from integer without a cast
make[1]: *** [lib/decompress_unlzma.o] Error 1
make: *** [lib/] Error 2
It gets included implicitly currently - but this will not be the
case with upcoming kmemtrace changes.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237886521.25315.58.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
lib/decompress_bunzip2.c depends on slab.h without including it:
CC lib/decompress_bunzip2.o
lib/decompress_bunzip2.c: In function ‘start_bunzip’:
lib/decompress_bunzip2.c:636: error: implicit declaration of function ‘kmalloc’
lib/decompress_bunzip2.c:636: warning: assignment makes pointer from integer without a cast
lib/decompress_bunzip2.c: In function ‘bunzip2’:
lib/decompress_bunzip2.c:682: warning: assignment makes pointer from integer without a cast
lib/decompress_bunzip2.c:693: warning: assignment makes pointer from integer without a cast
lib/decompress_bunzip2.c:726: error: implicit declaration of function ‘kfree’
make[1]: *** [lib/decompress_bunzip2.o] Error 1
make: *** [lib/] Error 2
It gets included implicitly currently - but this will not be the
case with upcoming kmemtrace changes.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237886032.25315.48.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix build
lib/decompress_inflate.c depends on slab.h without including it:
CC lib/decompress_inflate.o
lib/decompress_inflate.c: In function ‘gunzip’:
lib/decompress_inflate.c:45: error: implicit declaration of function ‘kmalloc’
lib/decompress_inflate.c:45: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:57: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:65: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:71: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:154: error: implicit declaration of function ‘kfree’
make[1]: *** [lib/decompress_inflate.o] Error 1
make: *** [lib/] Error 2
It gets included implicitly currently - but this will not be the
case with upcoming kmemtrace changes.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237886030.25315.47.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix slab corruption caused by alloc_cpumask_var_node() overwriting the
tail end of an off-stack cpumask.
The function zeros out cpumask bits beyond the last possible cpu. The
starting point for zeroing should be the beginning of the mask offset by a
byte count derived from the number of possible cpus. The offset was
calculated in bits instead of bytes. This resulted in overwriting the end
of the cpumask.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Mike Travis <travis.sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: <stable@kernel.org> [2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch for Per-CSS(Cgroup Subsys State) ID and private hierarchy code.
This patch attaches unique ID to each css and provides following.
- css_lookup(subsys, id)
returns pointer to struct cgroup_subysys_state of id.
- css_get_next(subsys, id, rootid, depth, foundid)
returns the next css under "root" by scanning
When cgroup_subsys->use_id is set, an id for css is maintained.
The cgroup framework only parepares
- css_id of root css for subsys
- id is automatically attached at creation of css.
- id is *not* freed automatically. Because the cgroup framework
don't know lifetime of cgroup_subsys_state.
free_css_id() function is provided. This must be called by subsys.
There are several reasons to develop this.
- Saving space .... For example, memcg's swap_cgroup is array of
pointers to cgroup. But it is not necessary to be very fast.
By replacing pointers(8bytes per ent) to ID (2byes per ent), we can
reduce much amount of memory usage.
- Scanning without lock.
CSS_ID provides "scan id under this ROOT" function. By this, scanning
css under root can be written without locks.
ex)
do {
rcu_read_lock();
next = cgroup_get_next(subsys, id, root, &found);
/* check sanity of next here */
css_tryget();
rcu_read_unlock();
id = found + 1
} while(...)
Characteristics:
- Each css has unique ID under subsys.
- Lifetime of ID is controlled by subsys.
- css ID contains "ID" and "Depth in hierarchy" and stack of hierarchy
- Allowed ID is 1-65535, ID 0 is UNUSED ID.
Design Choices:
- scan-by-ID v.s. scan-by-tree-walk.
As /proc's pid scan does, scan-by-ID is robust when scanning is done
by following kind of routine.
scan -> rest a while(release a lock) -> conitunue from interrupted
memcg's hierarchical reclaim does this.
- When subsys->use_id is set, # of css in the system is limited to
65535.
[bharata@linux.vnet.ibm.com: remove rcu_read_lock() from css_get_next()]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tfour 4 redundant if-conditions in function __rb_erase_color() in
lib/rbtree.c are removed.
In pseudo-source-code, the structure of the code is as follows:
if ((!A || B) && (!C || D)) {
.
.
.
} else {
if (!C || D) {//if this is true, it implies: (A == true) && (B == false)
if (A) {//hence this always evaluates to 'true'...
.
}
.
//at this point, C always becomes true, because of:
__rb_rotate_right/left();
//and:
other = parent->rb_right/left;
}
.
.
if (C) {//...and this too !
.
}
}
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
s390. This patch implements it for the rest of the architectures by
filling the pages with poison byte patterns after free_pages() and
verifying the poison patterns before alloc_pages().
This generic one cannot detect invalid page accesses immediately but
invalid read access may cause invalid dereference by poisoned memory and
invalid write access can be detected after a long delay.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix printk format warnings in dma-debug:
lib/dma-debug.c:645: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
lib/dma-debug.c:662: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
lib/dma-debug.c:676: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
lib/dma-debug.c:686: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow simple quoting of words in the dynamic debug control language.
This allows more natural specification when using the control language
to match against printk formats, e.g
#echo -n 'format "Setting node for non-present cpu" +p' >
/mnt/debugfs/dynamic_debug/control
instead of
#echo -n 'format Setting\040node\040for\040non-present\040cpu +p' >
/mnt/debugfs/dynamic_debug/control
Adjust the dynamic debug documention to describe that and provide a
new example. Adjust the existing examples in the documentation to
reflect the current whitespace escaping behaviour when reading the
control file. Fix some minor documentation trailing whitespace.
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch combines Greg Bank's dprintk() work with the existing dynamic
printk patchset, we are now calling it 'dynamic debug'.
The new feature of this patchset is a richer /debugfs control file interface,
(an example output from my system is at the bottom), which allows fined grained
control over the the debug output. The output can be controlled by function,
file, module, format string, and line number.
for example, enabled all debug messages in module 'nf_conntrack':
echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control
to disable them:
echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control
A further explanation can be found in the documentation patch.
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Right now, the kobject_uevent code blocks for each uevent that's being
generated, due to using (for hystoric reasons) UHM_WAIT_EXEC as flag to
call_usermode_helper(). Specifically, the effect is that each uevent
that is being sent causes the code to wake up keventd, then block until
keventd has processed the work. Needless to say, this happens many times
during the system boot.
This patches changes that to UHN_NO_WAIT (brilliant name for a constant
btw) so that we only schedule the work to fire the uevent message, but
do not wait for keventd to process the work.
This removes one of the bottlenecks during boot; each one of them is
only a small effect, but the sum of them does add up.
[Note, distros that need this are broken, they should be setting
CONFIG_UEVENT_HELPER_PATH to "", that way this code path will never be
excuted at all -- gregkh]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch implements uevent suppress in kobject and removes it
from struct device, based on the following ideas:
1,Uevent sending should be one attribute of kobject, so suppressing it
in kobject layer is more natural than in device layer. By this way,
we can do it for other objects embedded with kobject.
2,It may save several bytes for each instance of struct device.(On my
omap3(32bit ARM) based box, can save 8bytes per device object)
This patch also introduces dev_set|get_uevent_suppress() helpers to
set and query uevent_suppress attribute in case to help kobject
as private part of struct device in future.
[This version is against the latest driver-core patch set of Greg,please
ignore the last version.]
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that all users of bus_id is gone, we can remove it from struct
device.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: extend on-kernel-stack DMA debug checks to all !highmem pages
We only checked dma_map_single() - extend it to dma_map_page()
and dma_map_sg() as well.
Also, fix dma_map_single() corner case bug: make sure we dont
stack-check highmem (not mapped) pages.
Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
LKML-Reference: <1237818908-26516-1-git-send-email-joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: allow architectures to monitor busses for dma mem leakage
This patch adds checking code to detect if a device has pending DMA
operations when it is about to be unbound from its device driver.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Impact: get notified if a device dma maps illegal areas
This patch adds a check to print a warning message when a device driver
tries to map a memory area from the kernel text segment or rodata.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Impact: saves stacktrace of a dma mapping and prints it if there is an error
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This adds a function to dump the DMA mappings that the debugging code is
aware of -- either for a single device, or for _all_ devices.
This can be useful for debugging -- sticking a call to it in the DMA
page fault handler, for example, to see if the faulting address _should_
be mapped or not, and hence work out whether it's IOMMU bugs we're
seeing, or driver bugs.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Impact: avoid recursive kfree calls, less slab activity on heavy load
debugobjects checks on kfree whether tracked objects are freed. When a
tracked object is freed debugobjects frees the internal reference
object as well. The debug object slab cache is marked to not recurse
into debugobjects when a slab objects is freed, but the recursive call
can be problematic versus locking in the memory allocator.
Defer the freeing of debug slab objects via schedule_work. The reasons
not to use RCU are:
1) rcu makes the data structure larger
2) there is no real need for rcu as nothing references the obj after
we freed it
3) under heavy load it is easier to reuse the to be freed objects instead
of allocating new objects from the slab. This lowered the slab activity
significantly in a heavy load networking test where lots of timers are
created/destroyed. The workqueue based delayed free allows us just to
put the to be freed objects back into the object pool and reuse them
right away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <200903162049.58058.nickpiggin@yahoo.com.au>
Impact: refactor/consolidate object management, prepare for delayed free
debugobjects allocates static reference objects to track objects which
are initialized or activated before the slab cache becomes
available. These static reference objects have to be handled
seperately in free_object(). The handling of these objects is in the
way of implementing a delayed free functionality. The delayed free is
required to avoid callbacks into the mm code from
debug_check_no_obj_freed().
Replace the static object references with dynamic ones after the slab
cache has been initialized. The static objects are now marked initdata.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <200903162049.58058.nickpiggin@yahoo.com.au>
Jeremy Fitzhardinge reported:
> Change fef20d9c13, "vsprintf:
> unify the format decoding layer for its 3 users", causes a
> regression in xenbus which results in no devices getting
> attached to a new domain.
%.*s is broken - fix it.
Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Guennadi Liakhovetski noticed that the end condition for the loop in
bitmap_find_free_region() is wrong, and the "return if error" was also
using the wrong conditional that would only trigger if the bitmap was an
exact multiple of the allocation size, which is not necessarily the case
with dma_alloc_from_coherent().
Such a failure would end up in bitmap_find_free_region() accessing
beyond the end of the bitmap.
Reported-by: Guennadi Liakhovetski <lg@denx.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: cleanup
The naming clashes with upcoming softirq tracepoints, so rename the
APIs to lockdep_*().
Requested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We moved the netlink attribute support from net to lib in order
for it to be available for general consumption. However, parts
of the code (the bits that we don't need :) really depends on
NET because the target object is sk_buff.
This patch fixes this by wrapping them in CONFIG_NET.
Some EXPORTs have been moved to make this work.
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix a problem in the IDR system, where an idr_remove_all() hands a data
element to call_rcu() (via free_layer()) before making that data element
inaccessible to new readers. This is very bad, and results in readers
still having a reference to this data element at the end of the grace
period.
Tests on large machines that concurrently map and unmap user-space memory
within the same multithreaded process result in crashes within about five
minutes. Applying this patch increases the kernel's longevity to the
three-to-eight-hour range.
There appear to be other similar problems in idr_get_empty_slot() and
sub_remove(), but I fixed the easy one in idr_remove_all() first. It is
therefore no surprise that failures still occur.
Located-by: Milton Miller II <miltonm@austin.ibm.com>
Tested-by: Milton Miller II <miltonm@austin.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sitsofe Wheeler found and bisected that while unifying the
vsprintf format decoding in:
fef20d9: vsprintf: unify the format decoding layer for its 3 users
The sign flag has been dropped out in favour of
precise types (ie: LONG/ULONG).
But the format helper number() still needs this flag to keep track of
the signedness unless it will consider all numbers as unsigned.
Also add an explicit cast to int (for %d) while parsing with va_arg()
to ensure the highest bit is well extended on the 64 bits number that
hosts the value in case of negative values.
Reported-Bisected-Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20090309201503.GA5010@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
An new optimization is making its way to ftrace. Its purpose is to
make trace_printk() consuming less memory and be faster.
Written by Lai Jiangshan, the approach is to delay the formatting
job from tracing time to output time.
Currently, a call to trace_printk() will format the whole string and
insert it into the ring buffer. Then you can read it on /debug/tracing/trace
file.
The new implementation stores the address of the format string and
the binary parameters into the ring buffer, making the packet more compact
and faster to insert.
Later, when the user exports the traces, the format string is retrieved
with the binary parameters and the formatting job is eventually done.
The new implementation rewrites a lot of format decoding bits from
vsnprintf() function, making now 3 differents functions to maintain
in their duplicated parts of printf format decoding bits.
Suggested by Ingo Molnar, this patch tries to factorize the most
possible common bits from these functions.
The real common part between them is the format decoding. Although
they do somewhat similar jobs, their way to export or import the parameters
is very different. Thus, only the decoding layer is extracted, unless you see
other parts that could be worth factorized.
Changes in V2:
- Address a suggestion from Linus to group the format_decode() parameters inside
a structure.
Changes in v3:
- Address other cleanups suggested by Ingo and Linus such as passing the
printf_spec struct to the format helpers: pointer()/number()/string()
Note that this struct is passed by copy and not by address. This is to
avoid side effects because these functions often change these values and the
changes shoudn't be persistant when a callee helper returns.
It would be too risky.
- Various cleanups (code alignement, switch/case instead of if/else fountains).
- Fix a bug that printed the first format specifier following a %p
Changes in v4:
- drop unapropriate const qualifier loss while casting fmt to a char *
(thanks to Vegard Nossum for having pointed this out).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-6-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new APIs for binary trace printk infrastructure
vbin_printf(): write args to binary buffer, string is copied
when "%s" is occurred.
bstr_printf(): read from binary buffer for args and format a string
[fweisbec@gmail.com: rebase]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1236356510-8381-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Use test_tsk_need_resched(), set_tsk_need_resched(), need_resched()
instead of using TIF_NEED_RESCHED.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <49B10BA4.9070209@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>