The following adds two more bitmap operators, bitmap_onto() and bitmap_fold(),
with the usual cpumask and nodemask wrappers.
The bitmap_onto() operator computes one bitmap relative to another. If the
n-th bit in the origin mask is set, then the m-th bit of the destination mask
will be set, where m is the position of the n-th set bit in the relative mask.
The bitmap_fold() operator folds a bitmap into a second that has bit m set iff
the input bitmap has some bit n set, where m == n mod sz, for the specified sz
value.
There are two substantive changes between this patch and its
predecessor bitmap_relative:
1) Renamed bitmap_relative() to be bitmap_onto().
2) Added bitmap_fold().
The essential motivation for bitmap_onto() is to provide a mechanism for
converting a cpuset-relative CPU or Node mask to an absolute mask. Cpuset
relative masks are written as if the current task were in a cpuset whose CPUs
or Nodes were just the consecutive ones numbered 0..N-1, for some N. The
bitmap_onto() operator is provided in anticipation of adding support for the
first such cpuset relative mask, by the mbind() and set_mempolicy() system
calls, using a planned flag of MPOL_F_RELATIVE_NODES. These bitmap operators
(and their nodemask wrappers, in particular) will be used in code that
converts the user specified cpuset relative memory policy to a specific system
node numbered policy, given the current mems_allowed of the tasks cpuset.
Such cpuset relative mempolicies will address two deficiencies
of the existing interface between cpusets and mempolicies:
1) A task cannot at present reliably establish a cpuset
relative mempolicy because there is an essential race
condition, in that the tasks cpuset may be changed in
between the time the task can query its cpuset placement,
and the time the task can issue the applicable mbind or
set_memplicy system call.
2) A task cannot at present establish what cpuset relative
mempolicy it would like to have, if it is in a smaller
cpuset than it might have mempolicy preferences for,
because the existing interface only allows specifying
mempolicies for nodes currently allowed by the cpuset.
Cpuset relative mempolicies are useful for tasks that don't distinguish
particularly between one CPU or Node and another, but only between how many of
each are allowed, and the proper placement of threads and memory pages on the
various CPUs and Nodes available.
The motivation for the added bitmap_fold() can be seen in the following
example.
Let's say an application has specified some mempolicies that presume 16 memory
nodes, including say a mempolicy that specified MPOL_F_RELATIVE_NODES (cpuset
relative) nodes 12-15. Then lets say that application is crammed into a
cpuset that only has 8 memory nodes, 0-7. If one just uses bitmap_onto(),
this mempolicy, mapped to that cpuset, would ignore the requested relative
nodes above 7, leaving it empty of nodes. That's not good; better to fold the
higher nodes down, so that some nodes are included in the resulting mapped
mempolicy. In this case, the mempolicy nodes 12-15 are taken modulo 8 (the
weight of the mems_allowed of the confining cpuset), resulting in a mempolicy
specifying nodes 4-7.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: <kosaki.motohiro@jp.fujitsu.com>
Cc: <ray-lk@madrabbit.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Migrate flags must be set on slab creation as agreed upon when the antifrag
logic was reviewed. Otherwise some slabs of a slabcache will end up in the
unmovable and others in the reclaimable section depending on which flag was
active when a new slab page was allocated.
This likely slid in somehow when antifrag was merged. Remove it.
The buffer_heads are always allocated with __GFP_RECLAIMABLE because the
SLAB_RECLAIM_ACCOUNT option is set. The set_migrateflags() never had any
effect there.
Radix tree allocations are not directly reclaimable but they are allocated
with __GFP_RECLAIMABLE set on each allocation. We now set
SLAB_RECLAIM_ACCOUNT on radix tree slab creation making sure that radix
tree slabs are consistently placed in the reclaimable section. Radix tree
slabs will also be accounted as such.
There is then no user left of set_migratepages. So remove it.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce GENERIC_FIND_FIRST_BIT and GENERIC_FIND_NEXT_BIT in
lib/Kconfig, defaulting to off. An arch that wants to use the
generic implementation now only has to use a select statement
to include them.
I added an always-y option (X86_CPU) to arch/x86/Kconfig.cpu
and used that to select the generic search functions. This
way ARCH=um SUBARCH=i386 automatically picks up the change
too, and arch/um/Kconfig.i386 can therefore be simplified a
bit. ARCH=um SUBARCH=x86_64 does things differently, but
still compiles fine. It seems that a "def_bool y" always
wins over a "def_bool n"?
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Generic versions of __find_first_bit and __find_first_zero_bit
are introduced as simplified versions of __find_next_bit and
__find_next_zero_bit. Their compilation and use are guarded by
a new config variable GENERIC_FIND_FIRST_BIT.
The generic versions of find_first_bit and find_first_zero_bit
are implemented in terms of the newly introduced __find_first_bit
and __find_first_zero_bit.
This patch does not remove the i386-specific implementation,
but it does switch i386 to use the generic functions by setting
GENERIC_FIND_FIRST_BIT=y for X86_32.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This moves an optimization for searching constant-sized small
bitmaps form x86_64-specific to generic code.
On an i386 defconfig (the x86#testing one), the size of vmlinux hardly
changes with this applied. I have observed only four places where this
optimization avoids a call into find_next_bit:
In the functions return_unused_surplus_pages, alloc_fresh_huge_page,
and adjust_pool_surplus, this patch avoids a call for a 1-bit bitmap.
In __next_cpu a call is avoided for a 32-bit bitmap. That's it.
On x86_64, 52 locations are optimized with a minimal increase in
code size:
Current #testing defconfig:
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename
5392637 846592 724424 6963653 6a41c5 vmlinux
After removing the x86_64 specific optimization for find_next_*bit:
94 x bsf, 79 x find_next_*bit
text data bss dec hex filename
5392358 846592 724424 6963374 6a40ae vmlinux
After this patch (making the optimization generic):
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename
5392396 846592 724424 6963412 6a40d4 vmlinux
[ tglx@linutronix.de: build fixes ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The versions with inline assembly are in fact slower on the machines I
tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
crashing the benchmark.
Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
1...512, for each possible bitmap with one bit set, for each possible
offset: find the position of the first bit starting at offset. If you
follow ;). Times include setup of the bitmap and checking of the
results.
Athlon Xeon Opteron 32/64bit
x86-specific: 0m3.692s 0m2.820s 0m3.196s / 0m2.480s
generic: 0m2.622s 0m1.662s 0m2.100s / 0m1.572s
If the bitmap size is not a multiple of BITS_PER_LONG, and no set
(cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
value outside of the range [0, size]. The generic version always returns
exactly size. The generic version also uses unsigned long everywhere,
while the x86 versions use a mishmash of int, unsigned (int), long and
unsigned long.
Using the generic version does give a slightly bigger kernel, though.
defconfig: text data bss dec hex filename
x86-specific: 4738555 481232 626688 5846475 5935cb vmlinux (32 bit)
generic: 4738621 481232 626688 5846541 59360d vmlinux (32 bit)
x86-specific: 5392395 846568 724424 6963387 6a40bb vmlinux (64 bit)
generic: 5392458 846568 724424 6963450 6a40fa vmlinux (64 bit)
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add option to enable -Wframe-larger-than= on gcc 4.4
gcc mainline (upcoming 4.4) added a new -Wframe-larger-than=...
option to warn at build time about too large stack frames. Add a config
option to enable this warning, since this very useful for the kernel.
I choose (somewhat arbitarily) 2048 as default warning threshold for 64bit
and 1024 as default for 32bit architectures. With some research and
fixing all the code for smaller values these defaults should be probably
lowered.
With the default allyesconfigs have some new warnings, but I think
that is all code that should be just fixed.
At some point (when gcc 4.4 is released and widely used) this should
obsolete make checkstack
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Changeset d9024df02f ("[LMB] Restructure
allocation loops to avoid unsigned underflow") removed the alignment
of the 'size' argument to call lmb_add_region() done by __lmb_alloc_base().
In doing so it reintroduced the bug fixed by changeset
eea89e13a9 ("[LMB]: Fix bug in
__lmb_alloc_base().").
This puts it back.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the more concise list_for_each_entry(), which allows for the
deletion of the to_kobj() routine at the same time.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add warnings to kobject_put() to catch kobjects that are cleaned up but
were never initialized to begin with.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add a new function cpumask_scnprintf_len() to return the number of
characters needed to display "len" cpumask bits. The current method
of allocating NR_CPUS bytes is incorrect as what's really needed is
9 characters per 32-bit word of cpumask bits (8 hex digits plus the
seperator [','] or the terminating NULL.) This function provides the
caller the means to allocate the correct string length.
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There have been a few oopses caused by 'struct file's with NULL f_vfsmnts.
There was also a set of potentially missed mnt_want_write()s from
dentry_open() calls.
This patch provides a very simple debugging framework to catch these kinds of
bugs. It will WARN_ON() them, but should stop us from having any oopses or
mnt_writer count imbalances.
I'm quite convinced that this is a good thing because it found bugs in the
stuff I was working on as soon as I wrote it.
[hch: made it conditional on a debug option.
But it's still a little bit too ugly]
[hch: merged forced remount r/o fix from Dave and akpm's fix for the fix]
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This way firewire-ohci can be used for remote debugging like ohci1394.
Version with amendment from Fri, 11 Apr 2008 00:08:08 +0200.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Bernhard Kaindl <bk@suse.de>
This patch adds in the ability to compile the kgdb internal test
string into the kernel so as to run the tests at boot without changing
the kernel boot arguments. This patch also changes all the error
paths to invoke WARN_ON(1) which will emit the line number of the file
and dump the kernel stack when an error occurs.
You can disable the tests in a kernel that is built this way
using "kgdbts="
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds regression tests for testing the kgdb core and arch
specific implementation.
The kgdb test suite is designed to be built into the kernel and not as
a module because it uses a number of low level kernel and kgdb
primitives which should not be exported externally.
The kgdb test suite is designed as a KGDB I/O module which
simulates the communications that a debugger would have with kgdb.
The tests are broken up in to a line by line and referenced here as
a "get" which is kgdb requesting input and "put" which is kgdb
sending a response.
The kgdb suite can be invoked from the kernel command line
arguments system or executed dynamically at run time. The test
suite uses the variable "kgdbts" to obtain the information about
which tests to run and to configure the verbosity level. The
following are the various characters you can use with the kgdbts=
line:
When using the "kgdbts=" you only choose one of the following core
test types:
A = Run all the core tests silently
V1 = Run all the core tests with minimal output
V2 = Run all the core tests in debug mode
You can also specify optional tests:
N## = Go to sleep with interrupts of for ## seconds
to test the HW NMI watchdog
F## = Break at do_fork for ## iterations
S## = Break at sys_open for ## iterations
NOTE: that the do_fork and sys_open tests are mutually exclusive.
To invoke the kgdb test suite from boot you use a kernel start
argument as follows:
kgdbts=V1 kgdbwait
Or if you wanted to perform the NMI test for 6 seconds and do_fork
test for 100 forks, you could use:
kgdbts=V1N6F100 kgdbwait
The test suite can also be invoked at run time with:
echo kgdbts=V1N6F100 > /sys/module/kgdbts/parameters/kgdbts
Or as another example:
echo kgdbts=V2 > /sys/module/kgdbts/parameters/kgdbts
When developing a new kgdb arch specific implementation or
using these tests for the purpose of regression testing,
several invocations are required.
1) Boot with the test suite enabled by using the kernel arguments
"kgdbts=V1F100 kgdbwait"
## If kgdb arch specific implementation has NMI use
"kgdbts=V1N6F100
2) After the system boot run the basic test.
echo kgdbts=V1 > /sys/module/kgdbts/parameters/kgdbts
3) Run the concurrency tests. It is best to use n+1
while loops where n is the number of cpus you have
in your system. The example below uses only two
loops.
## This tests break points on sys_open
while [ 1 ] ; do find / > /dev/null 2>&1 ; done &
while [ 1 ] ; do find / > /dev/null 2>&1 ; done &
echo kgdbts=V1S10000 > /sys/module/kgdbts/parameters/kgdbts
fg # and hit control-c
fg # and hit control-c
## This tests break points on do_fork
while [ 1 ] ; do date > /dev/null ; done &
while [ 1 ] ; do date > /dev/null ; done &
echo kgdbts=V1F1000 > /sys/module/kgdbts/parameters/kgdbts
fg # and hit control-c
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
kgdb core code. Handles the protocol and the arch details.
[ mingo@elte.hu: heavily modified, simplified and cleaned up. ]
[ xemul@openvz.org: use find_task_by_pid_ns ]
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Alpha and FRV mutexes had an option to print lots of debugging messages
in their semaphore implementation. This feature has not been carried
over to the generic semaphores, so remove the stale Kconfig option.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility. Thanks to Peter Zijlstra for fixing the lockdep
warning. Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
kernel_lock.c uses DECLARE_MUTEX, up() and down() without explicitly
including asm/semaphore.h. This is fragile and leaves it vulnerable
to breakage during header reorganisations.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
There is a potential bug in __lmb_alloc_base where we subtract `size'
from the base address of a reserved region without checking whether
the subtraction could wrap around and produce a very large unsigned
value. In fact it probably isn't possible to hit the bug in practice
since it would only occur in the situation where we can't satisfy the
allocation request and there is a reserved region starting at 0.
This fixes the potential bug by breaking out of the loop when we get
to the point where the base of the reserved region is less than the
size requested. This also restructures the loop to be a bit easier to
follow.
The same logic got copied into lmb_alloc_nid_unreserved, so this makes
a similar change there. Here the bug is more likely to be hit because
the outer loop (in lmb_alloc_nid) goes through the memory regions in
increasing order rather than decreasing order as __lmb_alloc_base
does, and we are therefore more likely to hit the case where we are
testing against a reserved region with a base address of 0.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This makes no semantic changes. It fixes the whitespace and formatting
a bit, gets rid of a local DBG macro and uses the equivalent pr_debug
instead, and restructures one while loop that had a function call and
assignment in the condition to be a bit more readable. Some comments
about functions being called with relocation disabled were also removed
as they would just be confusing to most readers now that the code is
in lib/.
Signed-off-by: Paul Mackerras <paulus@samba.org>
A variant of lmb_alloc() that tries to allocate memory on a specified
NUMA node 'nid' but falls back to normal lmb_alloc() if that fails.
The caller provides a 'nid_range' function pointer which assists the
allocator. It is given args 'start', 'end', and pointer to integer
'this_nid'.
It places at 'this_nid' the NUMA node id that corresponds to 'start',
and returns the end address within 'start' to 'end' at which memory
assosciated with 'nid' ends.
This callback allows a platform to use lmb_alloc_nid() in just
about any context, even ones in which early_pfn_to_nid() might
not be working yet.
This function will be used by the NUMA setup code on sparc64, and also
it can be used by powerpc, replacing it's hand crafted
"careful_allocation()" function in arch/powerpc/mm/numa.c
If x86 ever converts it's NUMA support over to using the LMB helpers,
it can use this too as it has something entirely similar.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
count_partial() is used by both slabinfo and the sysfs proc support. Move
the function directly before the beginning of the sysfs code so that it can
be easily found. Rework the preprocessor conditional to take into account
that slub sysfs support depends on CONFIG_SYSFS *and* CONFIG_SLUB_DEBUG.
Make CONFIG_SLUB_STATS depend on CONFIG_SLUB_DEBUG and CONFIG_SYSFS. There
is no point of keeping statistics if no one can restrive them.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Shift of a LE value seems strange, probably meant to shift the cpu-order
variable as in the prvious section of the switch statement.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds new three helper functions to copy data between an SG
list and a linear buffer.
- sg_copy_from_buffer copies data from linear buffer to an SG list
- sg_copy_to_buffer copies data from an SG list to a linear buffer
When the APIs copy data from a linear buffer to an SG list,
flush_kernel_dcache_page is called. It's not necessary for everyone
but it's a no-op on most architectures and in general the API is not
used in performance critical path.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- Let it update the state of all CPUs. The network stack goes
into pains to feed the current IP addresses in, but it is not very
effective if that is only done for some random CPU instead of all.
So change it to feed bits into all CPUs. I decided to do that lockless
because well somewhat random results are ok.
v2: Drop rename so that this patch doesn't depend on x86 maintainers
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This does not look good, but there is no other choice. The compilation
without CONFIG_NET is broken and can not be fixed with ease.
After that there is no need for the following commits:
1567ca7eec3edf8fa5cc2d38f9a4f8
Revert them.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Lord wrote:
>
> On boot, syslog is flooded with "uevent: unsupported action-string;" messages.
..
> Mar 28 14:43:29 shrimp kernel: tty ptyqd: uevent: unsupported
> action-string; this will be ignored in a future kernel version
> Mar 28 14:43:29 shrimp kernel: tty ptyqe: uevent: unsupported
> action-string; this will be ignored in a future kernel version
> Mar 28 14:43:29 shrimp kernel: tty ptyqf: uevent: unsupported
> action-string; this will be ignored in a future kernel version
> Mar 28 14:43:29 shrimp kernel: tty ptyr0: uevent: unsupported
> action-string; this will be ignored in a future kernel version
..
These messages are a regression compared with 2.6.24, which did not
flood the syslog with them.
The actual underlying problem was introduced in 2.6.23, when somebody
made the string parsing no longer accept nul-terminated strings as a
valid input to store_uevent().
Eg. "add\0" was valid prior to 2.6.23, where the code regressed to
require "add" without the '\0'.
This patch fixes the 2.6.23 / 2.6.24 regressions, by having the code
once again tolerate the trailing '\0', if present.
According to GregKH, this mainly affects older Ubuntu systems, such as
the one I have here that requires this fix.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The knock-out. The pcounter abstraction is not used any longer in the
kernel.
Not sure whether this should go via netdev tree, but as far as I
remember it was added via this one, and besides Eric thinks that
Andrew shouldn't mind this.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fix broken compilation for 'allnoconfig'. This was introduced by
Introduced by commit 1218854afa ("[NET]
NETNS: Omit seq_net_private->net without CONFIG_NET_NS.")
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It appears that 64-bit PCI resources cannot possibly ever have worked on
x86-32 even when the RESOURCES_64BIT config option was set, because any
driver that tried to [pci_]ioremap() the resource would have been unable
to do so because the high 32 bits would have been silently dropped on
the floor by the ioremap() routines that only used "unsigned long".
Change them to use "resource_size_t" instead, which properly encodes the
whole 64-bit resource data if RESOURCES_64BIT is enabled.
Acked-by: H. Peter Anvin <hpa@kernel.org>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some drivers need to reserve all PCI BARs to prevent other drivers
misusing unoccupied BARs. pcim_iomap_regions_request_all() requests
all BARs and iomap specified BARs.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Commit 681cc5cd3e ("iommu sg merging:
swiotlb: respect the segment boundary limits") introduced two
possibilities for entering an endless loop in lib/swiotlb.c:
- if max_slots is zero (possible if mask is ~0UL)
- if the number of slots requested fits into a swiotlb segment, but is
too large for the part of a segment which remains after considering
offset_slots
This fixes them
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
iommu_is_span_boundary is used internally in the IOMMU helper
(lib/iommu-helper.c), a primitive function that judges whether a memory area
spans LLD's segment boundary or not.
It's difficult to convert some IOMMUs to use the IOMMU helper but
iommu_is_span_boundary is still useful for them. So this patch exports it.
This is needed for the parisc iommu fixes.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kset_initialize was calling kobject_init_internal() which didn't
initialize the kobject as well as kobject_init() was. So have
kobject_init() call kobject_init_internal() and move the logic to
initalize the kobject there.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
lib/vsprintf.c: Fix bug omitting minus sign of numbers (module_param)
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make LKDTM depend on BLOCK to prevent build failures with certain configs.
Signed-off-by: Chris Snook <csnook@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We introduced a bug in fixing lmb_add_region to handle an initial
region being non-zero. Before that fix it was impossible to insert a
region at the head of the list since the first region always started
at zero.
Now that its possible for the first region to be non-zero we need to
check to see if the new region should be added at the head and if so
actually add it.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We started to see patches enabling this - so explain why
it is disabled and the condition to enable it again.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Convert the lmb code to use u64 instead of unsigned long for physical
addresses and sizes. This is needed to support large amounts of RAM
on 32-bit systems that support 36-bit physical addressing.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we add to an empty lmb region with a non-zero base we will not
coalesce the number of regions down to one. This causes problems on
ppc32 for the memory region as its assumed to only have one region.
We can fix this be easily specially casing the initial add to just
replace the dummy region.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to check lmb_add_region() for errors, it can run out
of regions etc.
Also, the size needs to be padded to the given alignment
or else the lmb.reserved regions don't get expanded and
instead we get tons of holes and eventually run out of
regions prematurely.
Signed-off-by: David S. Miller <davem@davemloft.net>
In arch/x86/boot/printf.c gets rid of unused tail of digits: const char
*digits = "0123456789abcdefghijklmnopqrstuvwxyz"; (we are using 0-9a-f
only)
Uses smaller/faster lowercasing (by ORing with 0x20)
if we know that we work on numbers/digits. Makes
strtoul smaller, and also we are getting rid of
static const char small_digits[] = "0123456789abcdefx";
static const char large_digits[] = "0123456789ABCDEFX";
since this works equally well:
static const char digits[16] = "0123456789ABCDEF";
Size savings:
$ size vmlinux.org vmlinux
text data bss dec hex filename
877320 112252 90112 1079684 107984 vmlinux.org
877048 112252 90112 1079412 107874 vmlinux
It may be also a tiny bit faster because code has less
branches now, but I doubt it is measurable.
[ hugh@veritas.com: uppercase pointers fix ]
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Other than the defconfigs, remove the entry in compiler-gcc4.h,
Kconfig.debug and feature-removal-schedule.txt.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
lib/scatterlist.c is needed by drivers/media/video/videobuf-dma-sg.c, and
we would like to be able to use the latter without PCI too, for example, on
PXA270 ARM CPU. It is then possible to create a configuration with
CONFIG_BLOCK=n, where only module code will need scatterlist.c. Therefore
it must be in obj-y.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, for every sysfs node, the callers will be responsible for
implementing store operation, so many many callers are doing duplicate
things to validate input, they have the same mistakes because they are
calling simple_strtol/ul/ll/uul, especially for module params, they are
just numeric, but you can echo such values as 0x1234xxx, 07777888 and
1234aaa, for these cases, module params store operation just ignores
succesive invalid char and converts prefix part to a numeric although input
is acctually invalid.
This patch tries to fix the aforementioned issues and implements
strict_strtox serial functions, kernel/params.c uses them to strictly
validate input, so module params will reject such values as 0x1234xxxx and
returns an error:
write error: Invalid argument
Any modules which export numeric sysfs node can use strict_strtox instead of
simple_strtox to reject any invalid input.
Here are some test results:
Before applying this patch:
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 0x1000 > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 0x1000g > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 0x1000gggggggg > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 010000 > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 0100008 > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 010000aaaaa > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]#
After applying this patch:
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 0x1000 > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 0x1000g > /sys/module/e1000/parameters/copybreak
-bash: echo: write error: Invalid argument
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo 0x1000gggggggg > /sys/module/e1000/parameters/copybreak
-bash: echo: write error: Invalid argument
[root@yangyi-dev /]# echo 010000 > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# echo 0100008 > /sys/module/e1000/parameters/copybreak
-bash: echo: write error: Invalid argument
[root@yangyi-dev /]# echo 010000aaaaa > /sys/module/e1000/parameters/copybreak
-bash: echo: write error: Invalid argument
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]# echo -n 4096 > /sys/module/e1000/parameters/copybreak
[root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
4096
[root@yangyi-dev /]#
[akpm@linux-foundation.org: fix compiler warnings]
[akpm@linux-foundation.org: fix off-by-one found by tiwai@suse.de]
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sometimes simple attributes might need to return an error, e.g. for
acquiring a mutex interruptibly. In fact we have that situation in
spufs already which is the original user of the simple attributes. This
patch merged the temporarily forked attributes in spufs back into the
main ones and allows to return errors.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <stefano.brivio@polimi.it>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg KH <greg@kroah.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>