The upcoming GCC 9 release extends the -Wmissing-attributes warnings
(enabled by -Wall) to C and aliases: it warns when particular function
attributes are missing in the aliases but not in their target.
In particular, it triggers here because crc32_le_base/__crc32c_le_base
aren't __pure while their target crc32_le/__crc32c_le are.
These aliases are used by architectures as a fallback in accelerated
versions of CRC32. See commit 9784d82db3eb ("lib/crc32: make core crc32()
routines weak so they can be overridden").
Therefore, being fallbacks, it is likely that even if the aliases
were called from C, there wouldn't be any optimizations possible.
Currently, the only user is arm64, which calls this from asm.
Still, marking the aliases as __pure makes sense and is a good idea
for documentation purposes and possible future optimizations,
which also silences the warning.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Allow architectures to drop in accelerated CRC32 routines by making
the crc32_le/__crc32c_le entry points weak, and exposing non-weak
aliases for them that may be used by the accelerated versions as
fallbacks in case the instructions they rely upon are not available.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit 60a3cdd063 ("x86: add optimized inlining") introduced
CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
The idea is obviously arch-agnostic. This commit moves the config entry
from arch/x86/Kconfig.debug to lib/Kconfig.debug so that all
architectures can benefit from it.
This can make a huge difference in kernel image size especially when
CONFIG_OPTIMIZE_FOR_SIZE is enabled.
For example, I got 3.5% smaller arm64 kernel for v5.1-rc1.
dec file
18983424 arch/arm64/boot/Image.before
18321920 arch/arm64/boot/Image.after
This also slightly improves the "Kernel hacking" Kconfig menu as
e61aca5158 ("Merge branch 'kconfig-diet' from Dave Hansen') suggested;
this config option would be a good fit in the "compiler option" menu.
Link: http://lkml.kernel.org/r/20190423034959.13525-12-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marek Vasut <marek.vasut@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[kdrag0n: Backported to k4.14]
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
xxhash's performance depends heavily on compiler optimizations including
inlines. Follow upstream's behavior and inline those helper functions.
Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
Change-Id: Ieee439865dc48f21a97032a8009941331c302227
These simply wraps memcpy().
Replace it with macros so that it is naturally inlined.
Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
Change-Id: I67887a706816aff46339d43b2f28e97813c1d4af
- This fixes the below warnings:
In file included from ../fs/f2fs/super.c:28:
../include/linux/lz4.h:221:12: warning: 'LZ4_compress_fast' declared 'static' but never defined [-Wunused-function]
221 | static int LZ4_compress_fast(const char *source, char *dest, int inputSize,
| ^~~~~~~~~~~~~~~~~
../include/linux/lz4.h:245:12: warning: 'LZ4_compress_destSize' declared 'static' but never defined [-Wunused-function]
245 | static int LZ4_compress_destSize(const char *source, char *dest, int *sourceSizePtr,
| ^~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:361:13: warning: 'LZ4_resetStreamHC' declared 'static' but never defined [-Wunused-function]
361 | static void LZ4_resetStreamHC(LZ4_streamHC_t *streamHCPtr, int compressionLevel);
| ^~~~~~~~~~~~~~~~~
../include/linux/lz4.h:376:17: warning: 'LZ4_loadDictHC' declared 'static' but never defined [-Wunused-function]
376 | static int LZ4_loadDictHC(LZ4_streamHC_t *streamHCPtr, const char *dictionary,
| ^~~~~~~~~~~~~~
../include/linux/lz4.h:415:12: warning: 'LZ4_compress_HC_continue' declared 'static' but never defined [-Wunused-function]
415 | static int LZ4_compress_HC_continue(LZ4_streamHC_t *streamHCPtr, const char *src,
| ^~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:434:12: warning: 'LZ4_saveDictHC' declared 'static' but never defined [-Wunused-function]
434 | static int LZ4_saveDictHC(LZ4_streamHC_t *streamHCPtr, char *safeBuffer,
| ^~~~~~~~~~~~~~
../include/linux/lz4.h:450:29: warning: 'LZ4_resetStream' declared 'static' but never defined [-Wunused-function]
450 | static __always_inline void LZ4_resetStream(LZ4_stream_t *LZ4_stream);
| ^~~~~~~~~~~~~~~
../include/linux/lz4.h:507:12: warning: 'LZ4_compress_fast_continue' declared 'static' but never defined [-Wunused-function]
507 | static int LZ4_compress_fast_continue(LZ4_stream_t *streamPtr, const char *src,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:521:12: warning: 'LZ4_setStreamDecode' declared 'static' but never defined [-Wunused-function]
521 | static int LZ4_setStreamDecode(LZ4_streamDecode_t *LZ4_streamDecode,
| ^~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:560:12: warning: 'LZ4_decompress_safe_continue' declared 'static' but never defined [-Wunused-function]
560 | static int LZ4_decompress_safe_continue(LZ4_streamDecode_t *LZ4_streamDecode,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:599:12: warning: 'LZ4_decompress_fast_continue' declared 'static' but never defined [-Wunused-function]
599 | static int LZ4_decompress_fast_continue(LZ4_streamDecode_t *LZ4_streamDecode,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:622:12: warning: 'LZ4_decompress_safe_usingDict' declared 'static' but never defined [-Wunused-function]
622 | static int LZ4_decompress_safe_usingDict(const char *source, char *dest,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:645:12: warning: 'LZ4_decompress_fast_usingDict' declared 'static' but never defined [-Wunused-function]
645 | static int LZ4_decompress_fast_usingDict(const char *source, char *dest,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Most places in the kernel that we need to distinguish functions by the
type of their arguments, we use '_ul' as a suffix for the unsigned long
variant, not '_ext'. Also add kernel-doc.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
All current users of idr_alloc_ext() actually want to allocate a u32
and idr_alloc_u32() fits their needs better.
Like idr_get_next(), it uses a 'nextid' argument which serves as both
a pointer to the start ID and the assigned ID (instead of a separate
minimum and pointer-to-assigned-ID argument). It uses a 'max' argument
rather than 'end' because the semantics that idr_alloc has for 'end'
don't work well for unsigned types.
Since idr_alloc_u32() returns an errno instead of the allocated ID, mark
it as __must_check to help callers use it correctly. Include copious
kernel-doc. Chris Mi <chrism@mellanox.com> has promised to contribute
test-cases for idr_alloc_u32.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Patch series "XArray", v9. (First part thereof).
This patchset is, I believe, appropriate for merging for 4.17. It
contains the XArray implementation, to eventually replace the radix
tree, and converts the page cache to use it.
This conversion keeps the radix tree and XArray data structures in sync
at all times. That allows us to convert the page cache one function at
a time and should allow for easier bisection. Other than renaming some
elements of the structures, the data structures are fundamentally
unchanged; a radix tree walk and an XArray walk will touch the same
number of cachelines. I have changes planned to the XArray data
structure, but those will happen in future patches.
Improvements the XArray has over the radix tree:
- The radix tree provides operations like other trees do; 'insert' and
'delete'. But what most users really want is an automatically
resizing array, and so it makes more sense to give users an API that
is like an array -- 'load' and 'store'. We still have an 'insert'
operation for users that really want that semantic.
- The XArray considers locking as part of its API. This simplifies a
lot of users who formerly had to manage their own locking just for
the radix tree. It also improves code generation as we can now tell
RCU that we're holding a lock and it doesn't need to generate as much
fencing code. The other advantage is that tree nodes can be moved
(not yet implemented).
- GFP flags are now parameters to calls which may need to allocate
memory. The radix tree forced users to decide what the allocation
flags would be at creation time. It's much clearer to specify them at
allocation time.
- Memory is not preloaded; we don't tie up dozens of pages on the off
chance that the slab allocator fails. Instead, we drop the lock,
allocate a new node and retry the operation. We have to convert all
the radix tree, IDA and IDR preload users before we can realise this
benefit, but I have not yet found a user which cannot be converted.
- The XArray provides a cmpxchg operation. The radix tree forces users
to roll their own (and at least four have).
- Iterators take a 'max' parameter. That simplifies many users and will
reduce the amount of iteration done.
- Iteration can proceed backwards. We only have one user for this, but
since it's called as part of the pagefault readahead algorithm, that
seemed worth mentioning.
- RCU-protected pointers are not exposed as part of the API. There are
some fun bugs where the page cache forgets to use rcu_dereference()
in the current codebase.
- Value entries gain an extra bit compared to radix tree exceptional
entries. That gives us the extra bit we need to put huge page swap
entries in the page cache.
- Some iterators now take a 'filter' argument instead of having
separate iterators for tagged/untagged iterations.
The page cache is improved by this:
- Shorter, easier to read code
- More efficient iterations
- Reduction in size of struct address_space
- Fewer walks from the top of the data structure; the XArray API
encourages staying at the leaf node and conducting operations there.
This patch (of 8):
None of these bits may be used for slab allocations, so we can use them
as radix tree flags as long as we mask them off before passing them to
the slab allocator. Move the IDR flag from the high bits to the
GFP_ZONEMASK bits.
Link: http://lkml.kernel.org/r/20180313132639.17387-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
LZ4 final literal copy could be overlapped when doing
in-place decompression, so it's unsafe to just use memcpy()
on an optimized memcpy approach but memmove() instead.
Upstream LZ4 has updated this years ago [1] (and the impact
is non-sensible [2] plus only a few bytes remain), this commit
just synchronizes LZ4 upstream code to the kernel side as well.
It can be observed as EROFS in-place decompression failure
on specific files when X86_FEATURE_ERMS is unsupported,
memcpy() optimization of commit 59daa706fb ("x86, mem:
Optimize memcpy by avoiding memory false dependece") will
be enabled then.
Currently most modern x86-CPUs support ERMS, these CPUs just
use "rep movsb" approach so no problem at all. However, it can
still be verified with forcely disabling ERMS feature...
arch/x86/lib/memcpy_64.S:
ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \
- "jmp memcpy_erms", X86_FEATURE_ERMS
+ "jmp memcpy_orig", X86_FEATURE_ERMS
We didn't observe any strange on arm64/arm/x86 platform before
since most memcpy() would behave in an increasing address order
("copy upwards" [3]) and it's the correct order of in-place
decompression but it really needs an update to memmove() for sure
considering it's an undefined behavior according to the standard
and some unique optimization already exists in the kernel.
[1] 33cb8518ac
[2] https://github.com/lz4/lz4/pull/717#issuecomment-497818921
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=12518
Link: https://lkml.kernel.org/r/20201122030749.2698994-1-hsiangkao@redhat.com
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Nick Terrell <terrelln@fb.com>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Miao Xie <miaoxie@huawei.com>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Li Guifu <bluce.liguifu@huawei.com>
Cc: Guo Xuenan <guoxuenan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 89b158635ad79574bde8e94d45dad33f8cf09549)
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Change-Id: Ib192a9a7cf891b201ff2be476e60fc01c8a33652
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
This patch replaces all memcpy() calls with LZ4_memcpy() which calls
__builtin_memcpy() so the compiler can inline it.
LZ4 relies heavily on memcpy() with a constant size being inlined. In x86
and i386 pre-boot environments memcpy() cannot be inlined because memcpy()
doesn't get defined as __builtin_memcpy().
An equivalent patch has been applied upstream so that the next import
won't lose this change [1].
I've measured the kernel decompression speed using QEMU before and after
this patch for the x86_64 and i386 architectures. The speed-up is about
10x as shown below.
Code Arch Kernel Size Time Speed
v5.8 x86_64 11504832 B 148 ms 79 MB/s
patch x86_64 11503872 B 13 ms 885 MB/s
v5.8 i386 9621216 B 91 ms 106 MB/s
patch i386 9620224 B 10 ms 962 MB/s
I also measured the time to decompress the initramfs on x86_64, i386, and
arm. All three show the same decompression speed before and after, as
expected.
[1] https://github.com/lz4/lz4/pull/890
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Gao Xiang <gaoxiang25@huawei.com>
Cc: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Arvind Sankar <nivedita@alum.mit.edu>
Link: http://lkml.kernel.org/r/20200803194022.2966806-1-nickrterrell@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b1a3e75e466d96383508634f3d2e477ac45f2fc1)
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Change-Id: I5345e2d16f7e552f437aae2015905ff121ad752d
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
This operation was intentional, but tools such as smatch will warn that it
might not have been.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Yann Collet <cyan@fb.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Gao Xiang <hsiangkao@aol.com>
Link: http://lkml.kernel.org/r/3bf931c6ea0cae3e23f3485801986859851b4f04.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e8ec04938c446e5f4dc53e0147f0f679995012ee)
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Change-Id: I0686513a159bcb74c7f22337fafb1c65782bf9cd
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Update the LZ4 compression module based on LZ4 v1.8.3 in order for the
erofs file system to use the newest LZ4_decompress_safe_partial() which
can now decode exactly the nb of bytes requested [1] to take place of the
open hacked code in the erofs file system itself.
Currently, apart from the erofs file system, no other users use
LZ4_decompress_safe_partial, so no worry about the interface.
In addition, LZ4 v1.8.x boosts up decompression speed compared to the
current code which is based on LZ4 v1.7.3, mainly due to shortcut
optimization for the specific common LZ4-sequences [2].
lzbench testdata (tested in kirin710, 8 cores, 4 big cores
at 2189Mhz, 2GB DDR RAM at 1622Mhz, with enwik8 testdata [3]):
Compressor name Compress. Decompress. Compr. size Ratio Filename
memcpy 5004 MB/s 4924 MB/s 100000000 100.00 enwik8
lz4hc 1.7.3 -9 12 MB/s 653 MB/s 42203253 42.20 enwik8
lz4hc 1.8.0 -9 12 MB/s 908 MB/s 42203096 42.20 enwik8
lz4hc 1.8.3 -9 11 MB/s 965 MB/s 42203094 42.20 enwik8
[1] https://github.com/lz4/lz4/issues/56608d347b5b2
[2] v1.8.1 perf: slightly faster compression and decompression speed
a31b7058cb
v1.8.2 perf: slightly faster HC compression and decompression speed
45f8603aae1a191b3f8d
[3] http://mattmahoney.net/dc/textdata.htmlhttp://mattmahoney.net/dc/enwik8.zip
Link: http://lkml.kernel.org/r/1537181207-21932-1-git-send-email-gaoxiang25@huawei.com
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Tested-by: Guo Xuenan <guoxuenan@huawei.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Fang Wei <fangwei1@huawei.com>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Miao Xie <miaoxie@huawei.com>
Cc: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: <weidu.du@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
commit be08815c5d3b25e53cd9b53a4d768d5f3d93ba25 upstream.
We have one triggering on eBPF but lets also add a cBPF example to
make sure we keep tracking them. Also add anther cBPF test running
max number of MSH ops.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change-Id: I560a9a0097ab7f10cc54c12265e180e2de122e39
[ Upstream commit 2b7e9f25e590726cca76700ebdb10e92a7a72ca1 ]
Each test case can have a set of sub-tests, where each sub-test can
run the cBPF/eBPF test snippet with its own data_size and expected
result. Before, the end of the sub-test array was indicated by both
data_size and result being zero. However, most or all of the internal
eBPF tests has a data_size of zero already. When such a test also had
an expected value of zero, the test was never run but reported as
PASS anyway.
Now the test runner always runs the first sub-test, regardless of the
data_size and result values. The sub-test array zero-termination only
applies for any additional sub-tests.
There are other ways fix it of course, but this solution at least
removes the surprise of eBPF tests with a zero result always succeeding.
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721103822.3755111-1-johan.almbladh@anyfinetworks.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Change-Id: I597a79488a8c86dc6c514e4b003ae4dd44fb142e
[ Upstream commit ae7f47041d928b1a2f28717d095b4153c63cbf6a ]
This test now operates on DW as stated instead of W, which was
already covered by another test.
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721104058.3755254-1-johan.almbladh@anyfinetworks.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Change-Id: I823352fbf62cfdd588007c193495622d8bffa67d
LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest,
str) - dest`. This generally avoids the machinery involved in parsing
format strings. `stpcpy` is just like `strcpy` except it returns the
pointer to the new tail of `dest`. This optimization was introduced into
clang-12.
Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.
Similar to last year's fire drill with: commit 5f074f3e192f
("lib/string.c: implement a basic bcmp")
The kernel is somewhere between a "freestanding" environment (no full
libc) and "hosted" environment (many symbols from libc exist with the same
type, function signature, and semantics).
As H. Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather than
opt-out.
Arvind notes, -fno-builtin-* behaves slightly differently between GCC and
Clang, and Clang is missing many __builtin_* definitions, which I consider
a bug in Clang and am working on fixing.
Masahiro summarizes the subtle distinction between compilers justly:
To prevent transformation from foo() into bar(), there are two ways in
Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is
only one in GCC; -fno-buitin-foo.
(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)
Masahiro also notes:
We want to disable optimization from foo() to bar(),
but we may still benefit from the optimization from
foo() into something else. If GCC implements the same transform, we
would run into a problem because it is not -fno-builtin-bar, but
-fno-builtin-foo that disables that optimization.
In this regard, -fno-builtin-foo would be more future-proof than
-fno-built-bar, but -fno-builtin-foo is still potentially overkill. We
may want to prevent calls from foo() being optimized into calls to
bar(), but we still may want other optimization on calls to foo().
It seems that compilers today don't quite provide the fine grain control
over which libcall optimizations pseudo-freestanding environments would
prefer.
Finally, Kees notes that this interface is unsafe, so we should not
encourage its use. As such, I've removed the declaration from any header,
but it still needs to be exported to avoid linkage errors in modules.
Link: https://lkml.kernel.org/r/20200825140001.2941001-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: ClangBuiltLinux/linux#1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Andy Lavr <andy.lavr@gmail.com>
Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Joe Perches <joe@perches.com>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: <stable@vger.kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alexandru Ardelean <alexandru.ardelean@analog.com>
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Change-Id: Id6bf5e796fdab4dd421519b9136c6bd9cd7cbe42
[ Upstream commit acaab7335bd6f0c0b54ce3a00bd7f18222ce0f5f ]
The zlib inflate code has an old micro-optimization based on the
assumption that for pre-increment memory accesses, the compiler will
generate code that fits better into the processor's pipeline than what
would be generated for post-increment memory accesses.
This optimization was already removed in upstream zlib in 2016:
https://github.com/madler/zlib/commit/9aaec95e8211
This optimization causes UB according to C99, which says in section 6.5.6
"Additive operators": "If both the pointer operand and the result point to
elements of the same array object, or one past the last element of the
array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined".
This UB is not only a theoretical concern, but can also cause trouble for
future work on compiler-based sanitizers.
According to the zlib commit, this optimization also is not optimal
anymore with modern compilers.
Replace uses of OFF, PUP and UP_UNALIGNED with their definitions in the
POSTINC case, and remove the macro definitions, just like in the upstream
patch.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Link: http://lkml.kernel.org/r/20200507123112.252723-1-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18f1ca46858eac22437819937ae44aa9a8f9f2fa ]
When building 64r6_defconfig with CONFIG_MIPS32_O32 disabled and
CONFIG_CRYPTO_RSA enabled:
lib/mpi/generic_mpih-mul1.c:37:24: error: invalid use of a cast in a
inline asm context requiring an l-value: remove the cast
or build with -fheinous-gnu-extensions
umul_ppmm(prod_high, prod_low, s1_ptr[j], s2_limb);
~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lib/mpi/longlong.h:664:22: note: expanded from macro 'umul_ppmm'
: "=d" ((UDItype)(w0))
~~~~~~~~~~^~~
lib/mpi/generic_mpih-mul1.c:37:13: error: invalid use of a cast in a
inline asm context requiring an l-value: remove the cast
or build with -fheinous-gnu-extensions
umul_ppmm(prod_high, prod_low, s1_ptr[j], s2_limb);
~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lib/mpi/longlong.h:668:22: note: expanded from macro 'umul_ppmm'
: "=d" ((UDItype)(w1))
~~~~~~~~~~^~~
2 errors generated.
This special case for umul_ppmm for MIPS64r6 was added in
commit bbc25bee37d2b ("lib/mpi: Fix umul_ppmm() for MIPS64r6"), due to
GCC being inefficient and emitting a __multi3 intrinsic.
There is no such issue with clang; with this patch applied, I can build
this configuration without any problems and there are no link errors
like mentioned in the commit above (which I can still reproduce with
GCC 9.3.0 when that commit is reverted). Only use this definition when
GCC is being used.
This really should have been caught by commit b0c091ae04f67 ("lib/mpi:
Eliminate unused umul_ppmm definitions for MIPS") when I was messing
around in this area but I was not testing 64-bit MIPS at the time.
Link: https://github.com/ClangBuiltLinux/linux/issues/885
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit ab10ae1c3bef56c29bac61e1201c752221b87b41 upstream.
The range passed to user_access_begin() by strncpy_from_user() and
strnlen_user() starts at 'src' and goes up to the limit of userspace
although reads will be limited by the 'count' param.
On 32 bits powerpc (book3s/32) access has to be granted for each
256Mbytes segment and the cost increases with the number of segments to
unlock.
Limit the range with 'count' param.
Fixes: 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 594cc251fdd0d231d342d88b2fdff4bc42fb0690 upstream.
Originally, the rule used to be that you'd have to do access_ok()
separately, and then user_access_begin() before actually doing the
direct (optimized) user access.
But experience has shown that people then decide not to do access_ok()
at all, and instead rely on it being implied by other operations or
similar. Which makes it very hard to verify that the access has
actually been range-checked.
If you use the unsafe direct user accesses, hardware features (either
SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged
Access Never - on ARM) do force you to use user_access_begin(). But
nothing really forces the range check.
By putting the range check into user_access_begin(), we actually force
people to do the right thing (tm), and the range check vill be visible
near the actual accesses. We have way too long a history of people
trying to avoid them.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af700eaed0564d5d3963a7a51cb0843629d7fe3d upstream.
objtool points out several conditions that it does not like, depending
on the combination with other configuration options and compiler
variants:
stack protector:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0xbf: call to __stack_chk_fail() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0xbe: call to __stack_chk_fail() with UACCESS enabled
stackleak plugin:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x4a: call to stackleak_track_stack() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x4a: call to stackleak_track_stack() with UACCESS enabled
kasan:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x25: call to memcpy() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x25: call to memcpy() with UACCESS enabled
The stackleak and kasan options just need to be disabled for this file
as we do for other files already. For the stack protector, we already
attempt to disable it, but this fails on clang because the check is
mixed with the gcc specific -fno-conserve-stack option. According to
Andrey Ryabinin, that option is not even needed, dropping it here fixes
the stackprotector issue.
Link: http://lkml.kernel.org/r/20190722125139.1335385-1-arnd@arndb.de
Link: https://lore.kernel.org/lkml/20190617123109.667090-1-arnd@arndb.de/t/
Link: https://lore.kernel.org/lkml/20190722091050.2188664-1-arnd@arndb.de/t/
Fixes: d08965a27e84 ("x86/uaccess, ubsan: Fix UBSAN vs. SMAP")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d08965a27e84ca090b504844d50c24fc98587b11 upstream.
UBSAN can insert extra code in random locations; including AC=1
sections. Typically this code is not safe and needs wrapping.
So far, only __ubsan_handle_type_mismatch* have been observed in AC=1
sections and therefore only those are annotated.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[stable backport: only take the lib/Makefile change to resolve gcc-10
build issues]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5990cdee689c6885b27c6d969a3d58b09002b0bc ]
0day reports over and over on an powerpc randconfig with clang:
lib/mpi/generic_mpih-mul1.c:37:13: error: invalid use of a cast in a
inline asm context requiring an l-value: remove the cast or build with
-fheinous-gnu-extensions
Remove the superfluous casts, which have been done previously for x86
and arm32 in commit dea632cadd ("lib/mpi: fix build with clang") and
commit 7b7c1df2883d ("lib/mpi/longlong.h: fix building with 32-bit
x86").
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/991
Link: https://lore.kernel.org/r/20200413195041.24064-1-natechancellor@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 1ad3935b39da78a403e7df7a3813f866c731bc64 upstream.
Clang warns: vector initializers are not compatible with NEON intrinsics
in big endian mode [-Wnonportable-vector-initialization]
While this is usually the case, it's not an issue for this case since
we're initializing the uint8x16_t (16x uint8_t's) with the same value.
Instead, use vdupq_n_u8 which both compilers lower into a single movi
instruction: https://godbolt.org/z/vBrgzt
This avoids the static storage for a constant value.
Link: https://github.com/ClangBuiltLinux/linux/issues/214
Suggested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d5767057c9a76a29f073dad66b7fa12a90e8c748 ]
ext2_swab() is defined locally in lib/find_bit.c However it is not
specific to ext2, neither to bitmaps.
There are many potential users of it, so rename it to just swab() and
move to include/uapi/linux/swab.h
ABI guarantees that size of unsigned long corresponds to BITS_PER_LONG,
therefore drop unneeded cast.
Link: http://lkml.kernel.org/r/20200103202846.21616-1-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Joe Perches <joe@perches.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
The tests for initializing a variable defined between a switch
statement's test and its first "case" statement are currently not
initialized in Clang[1] nor the proposed auto-initialization feature in
GCC.
We should retain the test (so that we can evaluate compiler fixes),
but mark it as an "expected fail". The rest of the kernel source will
be adjusted to avoid this corner case.
Also disable -Wswitch-unreachable for the test so that the intentionally
broken code won't trigger warnings for GCC (nor future Clang) when
initialization happens this unhandled place.
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Suggested-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
[adelva: cherry picking to avoid boot test flakes]
Bug: 144999193
Link: https://lore.kernel.org/lkml/202002191358.2897A07C6@keescook/
Change-Id: I0e691f2299ab42526ea306a92551a1188c469136
Signed-off-by: Alistair Delva <adelva@google.com>
[ Upstream commit 305e519ce48e935702c32241f07d393c3c8fed3e ]
Walter Wu has reported a potential case in which init_stack_slab() is
called after stack_slabs[STACK_ALLOC_MAX_SLABS - 1] has already been
initialized. In that case init_stack_slab() will overwrite
stack_slabs[STACK_ALLOC_MAX_SLABS], which may result in a memory
corruption.
Link: http://lkml.kernel.org/r/20200218102950.260263-1-glider@google.com
Fixes: cd11016e5f ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reported-by: Walter Wu <walter-zh.wu@mediatek.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4e456fee215677584cafa7f67298a76917e89c64 ]
Clang warns:
../lib/scatterlist.c:314:5: warning: misleading indentation; statement
is not part of the previous 'if' [-Wmisleading-indentation]
return -ENOMEM;
^
../lib/scatterlist.c:311:4: note: previous statement is here
if (prv)
^
1 warning generated.
This warning occurs because there is a space before the tab on this
line. Remove it so that the indentation is consistent with the Linux
kernel coding style and clang no longer warns.
Link: http://lkml.kernel.org/r/20191218033606.11942-1-natechancellor@gmail.com
Link: https://github.com/ClangBuiltLinux/linux/issues/830
Fixes: edce6820a9 ("scatterlist: prevent invalid free when alloc fails")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit ad67b74d2469 ("printk: hash addresses printed with %p") and
Commit ef0010a30935 ("vsprintf: don't use 'restricted_pointer()'
when not restricting") effectively removed the ability to display
kernel addresses in the kernel log. While this may be a useful feature
in production builds, it is undesirable when trying to debug.
%px is not a possible alternative, because it is unable to differentiate
between a debug and production build.
Change-Id: I139fae7b8488936d214efdd2b5b807fa1c005467
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
[vinmenon@codeaurora.org: fixed the Kconfig conflicts]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Instead, just fall back on the new '%p' behavior which hashes the
pointer.
Otherwise, '%pK' - that was intended to mark a pointer as restricted -
just ends up leaking pointers that a normal '%p' wouldn't leak. Which
just make the whole thing pointless.
I suspect we should actually get rid of '%pK' entirely, and make it just
work as '%p' regardless, but this is the minimal obvious fix. People
who actually use 'kptr_restrict' should weigh in on which behavior they
want.
Change-Id: I8d14e7ce80c41755f8b4ea7b2cf21a59a9c6973e
Cc: Tobin Harding <me@tobin.cc>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-Commit: ef0010a30935de4e0211cbc7bdffc30446cdee9b
Git-Repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Temporarily move the local outside of switch statement to work around
https://bugs.llvm.org/show_bug.cgi?id=44916 and unbreak the
ClangBuiltLinux project Travis build.
This patch has been tested according to the instructions at
https://github.com/ClangBuiltLinux/continuous-integration
Bug: 149581678
Test: ./driver.sh REPO=android-4.14
Change-Id: Iea85ee09d8f0e6d698b5f23a02fe23a6002ad5e7
Signed-off-by: Alexander Potapenko <glider@google.com>
commit 3e21d9a501bf99aee2e5835d7f34d8c823f115b5 upstream.
In case memory resources for _ptr2_ were allocated, release them before
return.
Notice that in case _ptr1_ happens to be NULL, krealloc() behaves
exactly like kmalloc().
Addresses-Coverity-ID: 1490594 ("Resource leak")
Link: http://lkml.kernel.org/r/20200123160115.GA4202@embeddedor
Fixes: 3f15801cdc ("lib: add kasan test module")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a patch specific to android-common-4.14 where lib/test_meminit.c
and lib/test_stackinit.c are built into the kernel.
Call panic() to make the test failures more visible.
Bug: 144999193
Change-Id: Iaa9459f4bf63753c5ac1cf51c9691bf62b3924e9
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit 8c30d32b1a32 ("lib/test_stackinit: Handle Clang
auto-initialization pattern").
While the gcc plugin for automatic stack variable initialization (i.e.
CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL) performs initialization with
0x00 bytes, the Clang automatic stack variable initialization (i.e.
CONFIG_INIT_STACK_ALL) uses various type-specific patterns that are
typically 0xAA. Therefore the stackinit selftest has been fixed to check
that bytes are no longer the test fill pattern of 0xFF (instead of looking
for bytes that have become 0x00). This retains the test coverage for the
0x00 pattern of the gcc plugin while adding coverage for the mostly 0xAA
pattern of Clang.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Bug: 144999193
Change-Id: Ieb59dd5ba351d9ae96e5e3ad22f686be4524281a
Signed-off-by: Alexander Potapenko <glider@google.com>
With the realization that having debugfs enabled on "production" systems
is generally not a good idea, debugfs is being disabled from more and
more platforms over time. However, the functionality of dynamic
debugging still is needed at times, and since it relies on debugfs for
its user api, having debugfs disabled also forces dynamic debug to be
disabled.
To get around this, also create the "control" file for dynamic_debug in
procfs. This allows people turn on debugging as needed at runtime for
individual driverfs and subsystems.
Bug: 145162121
Reported-by: many different companies
Cc: Jason Baron <jbaron@akamai.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200210211142.GB1373304@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 239a5791ffd5559f51815df442c4dbbe7fc21ade)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Icd892ea823af6254726847700fd9c251d13b556b
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Bug: 145162121
Cc: linux-kernel@vger.kernel.org
Acked-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9fd714cd7f4676e8ff3f840911a8d64cacbeab8b)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I471385f3af3f96c767f59ae76ea0a115c9eb4f7a
There are in-tree users of atomic_dec_and_lock() which must acquire the
spin lock with interrupts disabled. To workaround the lack of an irqsave
variant of atomic_dec_and_lock() they use local_irq_save() at the call
site. This causes extra code and creates in some places unneeded long
interrupt disabled times. These places need also extra treatment for
PREEMPT_RT due to the disconnect of the irq disabling and the lock
function.
Implement the missing irqsave variant of the function.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r20180612161621.22645-3-bigeasy@linutronix.de
(cherry picked from commit ccfbb5bed407053b27492a9adc06064d949a9aa6)
BUG: 137270441
Change-Id: I49051801a73ac0de11d71bfa23930ce4826f7c61
Signed-off-by: Satya Tangirala <satyat@google.com>
commit c42b65e363ce97a828f81b59033c3558f8fa7f70 upstream.
A lot of code become ugly because of open coding allocations for bitmaps.
Introduce three helpers to allow users be more clear of intention
and keep their code neat.
Note, due to multiple circular dependencies we may not provide
the helpers as inliners. For now we keep them exported and, perhaps,
at some point in the future we will sort out header inclusion and
inheritance.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ab9bb6318b0967671e0c9b6537c1537d51ca4f45 ]
Commit dfe2a77fd2 ("kfifo: fix kfifo_alloc() and kfifo_init()") made
the kfifo code round the number of elements up. That was good for
__kfifo_alloc(), but it's actually wrong for __kfifo_init().
The difference? __kfifo_alloc() will allocate the rounded-up number of
elements, but __kfifo_init() uses an allocation done by the caller. We
can't just say "use more elements than the caller allocated", and have
to round down.
The good news? All the normal cases will be using power-of-two arrays
anyway, and most users of kfifo's don't use kfifo_init() at all, but one
of the helper macros to declare a KFIFO that enforce the proper
power-of-two behavior. But it looks like at least ibmvscsis might be
affected.
The bad news? Will Deacon refers to an old thread and points points out
that the memory ordering in kfifo's is questionable. See
https://lore.kernel.org/lkml/20181211034032.32338-1-yuleixzhang@tencent.com/
for more.
Fixes: dfe2a77fd2 ("kfifo: fix kfifo_alloc() and kfifo_init()")
Reported-by: laokz <laokz@foxmail.com>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Greg KH <greg@kroah.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>