Impact: cleanup
lib/decompress_unlzma.c depends on slab.h without including it:
CC lib/decompress_unlzma.o
lib/decompress_unlzma.c: In function ‘rc_free’:
lib/decompress_unlzma.c:122: error: implicit declaration of function ‘kfree’
lib/decompress_unlzma.c: In function ‘unlzma’:
lib/decompress_unlzma.c:551: error: implicit declaration of function ‘kmalloc’
lib/decompress_unlzma.c:551: warning: assignment makes pointer from integer without a cast
make[1]: *** [lib/decompress_unlzma.o] Error 1
make: *** [lib/] Error 2
It gets included implicitly currently - but this will not be the
case with upcoming kmemtrace changes.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237886521.25315.58.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
lib/decompress_bunzip2.c depends on slab.h without including it:
CC lib/decompress_bunzip2.o
lib/decompress_bunzip2.c: In function ‘start_bunzip’:
lib/decompress_bunzip2.c:636: error: implicit declaration of function ‘kmalloc’
lib/decompress_bunzip2.c:636: warning: assignment makes pointer from integer without a cast
lib/decompress_bunzip2.c: In function ‘bunzip2’:
lib/decompress_bunzip2.c:682: warning: assignment makes pointer from integer without a cast
lib/decompress_bunzip2.c:693: warning: assignment makes pointer from integer without a cast
lib/decompress_bunzip2.c:726: error: implicit declaration of function ‘kfree’
make[1]: *** [lib/decompress_bunzip2.o] Error 1
make: *** [lib/] Error 2
It gets included implicitly currently - but this will not be the
case with upcoming kmemtrace changes.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237886032.25315.48.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix build
lib/decompress_inflate.c depends on slab.h without including it:
CC lib/decompress_inflate.o
lib/decompress_inflate.c: In function ‘gunzip’:
lib/decompress_inflate.c:45: error: implicit declaration of function ‘kmalloc’
lib/decompress_inflate.c:45: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:57: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:65: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:71: warning: assignment makes pointer from integer without a cast
lib/decompress_inflate.c:154: error: implicit declaration of function ‘kfree’
make[1]: *** [lib/decompress_inflate.o] Error 1
make: *** [lib/] Error 2
It gets included implicitly currently - but this will not be the
case with upcoming kmemtrace changes.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237886030.25315.47.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix slab corruption caused by alloc_cpumask_var_node() overwriting the
tail end of an off-stack cpumask.
The function zeros out cpumask bits beyond the last possible cpu. The
starting point for zeroing should be the beginning of the mask offset by a
byte count derived from the number of possible cpus. The offset was
calculated in bits instead of bytes. This resulted in overwriting the end
of the cpumask.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Mike Travis <travis.sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: <stable@kernel.org> [2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch for Per-CSS(Cgroup Subsys State) ID and private hierarchy code.
This patch attaches unique ID to each css and provides following.
- css_lookup(subsys, id)
returns pointer to struct cgroup_subysys_state of id.
- css_get_next(subsys, id, rootid, depth, foundid)
returns the next css under "root" by scanning
When cgroup_subsys->use_id is set, an id for css is maintained.
The cgroup framework only parepares
- css_id of root css for subsys
- id is automatically attached at creation of css.
- id is *not* freed automatically. Because the cgroup framework
don't know lifetime of cgroup_subsys_state.
free_css_id() function is provided. This must be called by subsys.
There are several reasons to develop this.
- Saving space .... For example, memcg's swap_cgroup is array of
pointers to cgroup. But it is not necessary to be very fast.
By replacing pointers(8bytes per ent) to ID (2byes per ent), we can
reduce much amount of memory usage.
- Scanning without lock.
CSS_ID provides "scan id under this ROOT" function. By this, scanning
css under root can be written without locks.
ex)
do {
rcu_read_lock();
next = cgroup_get_next(subsys, id, root, &found);
/* check sanity of next here */
css_tryget();
rcu_read_unlock();
id = found + 1
} while(...)
Characteristics:
- Each css has unique ID under subsys.
- Lifetime of ID is controlled by subsys.
- css ID contains "ID" and "Depth in hierarchy" and stack of hierarchy
- Allowed ID is 1-65535, ID 0 is UNUSED ID.
Design Choices:
- scan-by-ID v.s. scan-by-tree-walk.
As /proc's pid scan does, scan-by-ID is robust when scanning is done
by following kind of routine.
scan -> rest a while(release a lock) -> conitunue from interrupted
memcg's hierarchical reclaim does this.
- When subsys->use_id is set, # of css in the system is limited to
65535.
[bharata@linux.vnet.ibm.com: remove rcu_read_lock() from css_get_next()]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tfour 4 redundant if-conditions in function __rb_erase_color() in
lib/rbtree.c are removed.
In pseudo-source-code, the structure of the code is as follows:
if ((!A || B) && (!C || D)) {
.
.
.
} else {
if (!C || D) {//if this is true, it implies: (A == true) && (B == false)
if (A) {//hence this always evaluates to 'true'...
.
}
.
//at this point, C always becomes true, because of:
__rb_rotate_right/left();
//and:
other = parent->rb_right/left;
}
.
.
if (C) {//...and this too !
.
}
}
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
s390. This patch implements it for the rest of the architectures by
filling the pages with poison byte patterns after free_pages() and
verifying the poison patterns before alloc_pages().
This generic one cannot detect invalid page accesses immediately but
invalid read access may cause invalid dereference by poisoned memory and
invalid write access can be detected after a long delay.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix printk format warnings in dma-debug:
lib/dma-debug.c:645: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
lib/dma-debug.c:662: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
lib/dma-debug.c:676: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
lib/dma-debug.c:686: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow simple quoting of words in the dynamic debug control language.
This allows more natural specification when using the control language
to match against printk formats, e.g
#echo -n 'format "Setting node for non-present cpu" +p' >
/mnt/debugfs/dynamic_debug/control
instead of
#echo -n 'format Setting\040node\040for\040non-present\040cpu +p' >
/mnt/debugfs/dynamic_debug/control
Adjust the dynamic debug documention to describe that and provide a
new example. Adjust the existing examples in the documentation to
reflect the current whitespace escaping behaviour when reading the
control file. Fix some minor documentation trailing whitespace.
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch combines Greg Bank's dprintk() work with the existing dynamic
printk patchset, we are now calling it 'dynamic debug'.
The new feature of this patchset is a richer /debugfs control file interface,
(an example output from my system is at the bottom), which allows fined grained
control over the the debug output. The output can be controlled by function,
file, module, format string, and line number.
for example, enabled all debug messages in module 'nf_conntrack':
echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control
to disable them:
echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control
A further explanation can be found in the documentation patch.
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Right now, the kobject_uevent code blocks for each uevent that's being
generated, due to using (for hystoric reasons) UHM_WAIT_EXEC as flag to
call_usermode_helper(). Specifically, the effect is that each uevent
that is being sent causes the code to wake up keventd, then block until
keventd has processed the work. Needless to say, this happens many times
during the system boot.
This patches changes that to UHN_NO_WAIT (brilliant name for a constant
btw) so that we only schedule the work to fire the uevent message, but
do not wait for keventd to process the work.
This removes one of the bottlenecks during boot; each one of them is
only a small effect, but the sum of them does add up.
[Note, distros that need this are broken, they should be setting
CONFIG_UEVENT_HELPER_PATH to "", that way this code path will never be
excuted at all -- gregkh]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch implements uevent suppress in kobject and removes it
from struct device, based on the following ideas:
1,Uevent sending should be one attribute of kobject, so suppressing it
in kobject layer is more natural than in device layer. By this way,
we can do it for other objects embedded with kobject.
2,It may save several bytes for each instance of struct device.(On my
omap3(32bit ARM) based box, can save 8bytes per device object)
This patch also introduces dev_set|get_uevent_suppress() helpers to
set and query uevent_suppress attribute in case to help kobject
as private part of struct device in future.
[This version is against the latest driver-core patch set of Greg,please
ignore the last version.]
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that all users of bus_id is gone, we can remove it from struct
device.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: extend on-kernel-stack DMA debug checks to all !highmem pages
We only checked dma_map_single() - extend it to dma_map_page()
and dma_map_sg() as well.
Also, fix dma_map_single() corner case bug: make sure we dont
stack-check highmem (not mapped) pages.
Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
LKML-Reference: <1237818908-26516-1-git-send-email-joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: allow architectures to monitor busses for dma mem leakage
This patch adds checking code to detect if a device has pending DMA
operations when it is about to be unbound from its device driver.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Impact: get notified if a device dma maps illegal areas
This patch adds a check to print a warning message when a device driver
tries to map a memory area from the kernel text segment or rodata.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Impact: saves stacktrace of a dma mapping and prints it if there is an error
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This adds a function to dump the DMA mappings that the debugging code is
aware of -- either for a single device, or for _all_ devices.
This can be useful for debugging -- sticking a call to it in the DMA
page fault handler, for example, to see if the faulting address _should_
be mapped or not, and hence work out whether it's IOMMU bugs we're
seeing, or driver bugs.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Impact: avoid recursive kfree calls, less slab activity on heavy load
debugobjects checks on kfree whether tracked objects are freed. When a
tracked object is freed debugobjects frees the internal reference
object as well. The debug object slab cache is marked to not recurse
into debugobjects when a slab objects is freed, but the recursive call
can be problematic versus locking in the memory allocator.
Defer the freeing of debug slab objects via schedule_work. The reasons
not to use RCU are:
1) rcu makes the data structure larger
2) there is no real need for rcu as nothing references the obj after
we freed it
3) under heavy load it is easier to reuse the to be freed objects instead
of allocating new objects from the slab. This lowered the slab activity
significantly in a heavy load networking test where lots of timers are
created/destroyed. The workqueue based delayed free allows us just to
put the to be freed objects back into the object pool and reuse them
right away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <200903162049.58058.nickpiggin@yahoo.com.au>
Impact: refactor/consolidate object management, prepare for delayed free
debugobjects allocates static reference objects to track objects which
are initialized or activated before the slab cache becomes
available. These static reference objects have to be handled
seperately in free_object(). The handling of these objects is in the
way of implementing a delayed free functionality. The delayed free is
required to avoid callbacks into the mm code from
debug_check_no_obj_freed().
Replace the static object references with dynamic ones after the slab
cache has been initialized. The static objects are now marked initdata.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <200903162049.58058.nickpiggin@yahoo.com.au>
Jeremy Fitzhardinge reported:
> Change fef20d9c13, "vsprintf:
> unify the format decoding layer for its 3 users", causes a
> regression in xenbus which results in no devices getting
> attached to a new domain.
%.*s is broken - fix it.
Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Guennadi Liakhovetski noticed that the end condition for the loop in
bitmap_find_free_region() is wrong, and the "return if error" was also
using the wrong conditional that would only trigger if the bitmap was an
exact multiple of the allocation size, which is not necessarily the case
with dma_alloc_from_coherent().
Such a failure would end up in bitmap_find_free_region() accessing
beyond the end of the bitmap.
Reported-by: Guennadi Liakhovetski <lg@denx.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: cleanup
The naming clashes with upcoming softirq tracepoints, so rename the
APIs to lockdep_*().
Requested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We moved the netlink attribute support from net to lib in order
for it to be available for general consumption. However, parts
of the code (the bits that we don't need :) really depends on
NET because the target object is sk_buff.
This patch fixes this by wrapping them in CONFIG_NET.
Some EXPORTs have been moved to make this work.
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix a problem in the IDR system, where an idr_remove_all() hands a data
element to call_rcu() (via free_layer()) before making that data element
inaccessible to new readers. This is very bad, and results in readers
still having a reference to this data element at the end of the grace
period.
Tests on large machines that concurrently map and unmap user-space memory
within the same multithreaded process result in crashes within about five
minutes. Applying this patch increases the kernel's longevity to the
three-to-eight-hour range.
There appear to be other similar problems in idr_get_empty_slot() and
sub_remove(), but I fixed the easy one in idr_remove_all() first. It is
therefore no surprise that failures still occur.
Located-by: Milton Miller II <miltonm@austin.ibm.com>
Tested-by: Milton Miller II <miltonm@austin.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sitsofe Wheeler found and bisected that while unifying the
vsprintf format decoding in:
fef20d9: vsprintf: unify the format decoding layer for its 3 users
The sign flag has been dropped out in favour of
precise types (ie: LONG/ULONG).
But the format helper number() still needs this flag to keep track of
the signedness unless it will consider all numbers as unsigned.
Also add an explicit cast to int (for %d) while parsing with va_arg()
to ensure the highest bit is well extended on the 64 bits number that
hosts the value in case of negative values.
Reported-Bisected-Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20090309201503.GA5010@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
An new optimization is making its way to ftrace. Its purpose is to
make trace_printk() consuming less memory and be faster.
Written by Lai Jiangshan, the approach is to delay the formatting
job from tracing time to output time.
Currently, a call to trace_printk() will format the whole string and
insert it into the ring buffer. Then you can read it on /debug/tracing/trace
file.
The new implementation stores the address of the format string and
the binary parameters into the ring buffer, making the packet more compact
and faster to insert.
Later, when the user exports the traces, the format string is retrieved
with the binary parameters and the formatting job is eventually done.
The new implementation rewrites a lot of format decoding bits from
vsnprintf() function, making now 3 differents functions to maintain
in their duplicated parts of printf format decoding bits.
Suggested by Ingo Molnar, this patch tries to factorize the most
possible common bits from these functions.
The real common part between them is the format decoding. Although
they do somewhat similar jobs, their way to export or import the parameters
is very different. Thus, only the decoding layer is extracted, unless you see
other parts that could be worth factorized.
Changes in V2:
- Address a suggestion from Linus to group the format_decode() parameters inside
a structure.
Changes in v3:
- Address other cleanups suggested by Ingo and Linus such as passing the
printf_spec struct to the format helpers: pointer()/number()/string()
Note that this struct is passed by copy and not by address. This is to
avoid side effects because these functions often change these values and the
changes shoudn't be persistant when a callee helper returns.
It would be too risky.
- Various cleanups (code alignement, switch/case instead of if/else fountains).
- Fix a bug that printed the first format specifier following a %p
Changes in v4:
- drop unapropriate const qualifier loss while casting fmt to a char *
(thanks to Vegard Nossum for having pointed this out).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-6-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new APIs for binary trace printk infrastructure
vbin_printf(): write args to binary buffer, string is copied
when "%s" is occurred.
bstr_printf(): read from binary buffer for args and format a string
[fweisbec@gmail.com: rebase]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1236356510-8381-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Use test_tsk_need_resched(), set_tsk_need_resched(), need_resched()
instead of using TIF_NEED_RESCHED.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <49B10BA4.9070209@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Require framepointers for x86, because otherwise we'll be having
empty stack traces, which is useless.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1236167295.5330.7240.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Netlink attribute parsing may be used even if CONFIG_NET is not set.
Move it from net/netlink to lib and control its inclusion based on the new
config symbol CONFIG_NLATTR, which is selected by CONFIG_NET.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
While trying to debug why my Atom netbook is falling over booting
rawhide debug-enabled kernels, I stumbled across the fact that we've
been enabling object debugging by default. However, once you default it
to on, you've got no way to turn it back off again at runtime.
Add a boolean toggle to turn it off. I would just make it an int
module_param, however people may already expect the boolean enable
behaviour, so just add an analogue for disabling.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
BUILD_DOCSRC should be controlled by "config" instead of "menuconfig".
I have no idea how I managed to use "menuconfig" here.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>