Generating a sync fence name by dynamically allocating memory and
formatting it with scnprintf in a hot path wastes CPU time on unneeded
debug info. Remove the name generation entirely to cut down on CPU waste in
the GPU's rendering hot path.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
POPP constantly attempts to lower the GPU's frequency behind the
governor's back in order to save power; however, the GPU governor in use
(msm-adreno-tz) is very good at determining the GPU's load and selecting
an appropriate frequency to run the GPU at.
POPP was created long ago, perhaps when msm-adreno-tz didn't exist or
didn't work so well, so it is clearly deprecated. Remove it.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Waking the GPU upon touch wastes power when the screen is being touched
in a way that does not induce animation or any actual need for GPU usage.
Instead of preemptively waking the GPU on touch input, wake it up upon
receiving an IOCTL_KGSL_GPU_COMMAND ioctl, since that is a sign that the
GPU will soon be needed.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Currently, the kgsl worker thread is erroneously ranked right below
Android's audio threads in terms of priority.
The kgsl worker thread is in the critical path for rendering frames to
the display, so increase its priority to match the priority of the
display commit threads.
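A minimal sketch of the idea, assuming the worker's task_struct is
reachable as `worker_task' and using an illustrative SCHED_FIFO priority
(the real value should match whatever the display commit threads run at):

  #include <linux/sched.h>
  #include <uapi/linux/sched/types.h>	/* struct sched_param */

  static void example_boost_kgsl_worker(struct task_struct *worker_task)
  {
  	/* illustrative priority only; match the display commit threads */
  	struct sched_param param = { .sched_priority = 16 };

  	sched_setscheduler_nocheck(worker_task, SCHED_FIFO, &param);
  }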
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
cpuidle was disabled while entering suspend as part of commit
8651f97bd9 in order to work around some
ACPI bugs. However, there's no reason to do this on modern
platforms. Leaving cpuidle enabled can result in improved power
consumption if dpm_resume_noirq runs for a significant time.
Change-Id: Ie182785b176f448698c0264eba554d1e315e8a06
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
The synchronize_rcu() in namespace_unlock() is called every time
a filesystem is unmounted. If a great many filesystems are mounted,
this can cause a noticeable slow-down in, for example, system shutdown.
The sequence:
mkdir -p /tmp/Mtest/{0..5000}
time for i in /tmp/Mtest/*; do mount -t tmpfs tmpfs $i ; done
time umount /tmp/Mtest/*
on a 4-cpu VM can report 8 seconds to mount the tmpfs filesystems, and
100 seconds to unmount them.
Boot the same VM with 1 CPU and it takes 18 seconds to mount the
tmpfs filesystems, but only 36 to unmount.
If we change the synchronize_rcu() to synchronize_rcu_expedited(),
the umount time on a 4-cpu VM drops to 0.6 seconds.
I think this 200-fold speed-up is worth the slightly higher system
impact of using synchronize_rcu_expedited().
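For reference, the change amounts to swapping the grace-period call in
namespace_unlock() (sketch, surrounding code abbreviated):

  static void namespace_unlock(void)
  {
  	...
  	/* an expedited grace period is far cheaper here than waiting
  	 * for a full one on every unmount */
  	synchronize_rcu_expedited();
  	...
  }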
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> (from general rcu perspective)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
The page allocator wakes all kswapds in an allocation context's allowed
nodemask in the slow path, so it doesn't make sense to have the kswapd-
waiter count per NUMA node. Instead, it should be a global counter
to stop all kswapds when there are no failed allocation requests.
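A rough sketch of the shape of the change, with hypothetical names (the
real patch hooks the page allocator slow path and the kswapd loop):

  #include <linux/atomic.h>

  /* hypothetical: one global counter instead of a per-node one */
  static atomic_long_t kswapd_waiters = ATOMIC_LONG_INIT(0);

  /* page allocator slow path (sketch): track each failed request */
  static void example_slowpath_enter(void)
  {
  	atomic_long_inc(&kswapd_waiters);
  }

  static void example_slowpath_exit(void)
  {
  	atomic_long_dec(&kswapd_waiters);
  }

  /* kswapd (sketch): stop once no failed allocation is outstanding */
  static bool example_kswapd_should_stop(void)
  {
  	return !atomic_long_read(&kswapd_waiters);
  }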
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
PAGE_ALLOC_COSTLY_ORDER allocations can cause vmpressure to incorrectly
think that memory pressure is high, when it's really just that the
allocation's high order is difficult to satisfy. When this rare scenario
occurs, ignore the input to vmpressure to avoid sending out a spurious
high-pressure signal.
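A hedged sketch of the check; the actual patch has to plumb the
allocation order into vmpressure(), so the `order' parameter here is an
assumption:

  /* sketch: if reclaim was driven by a costly high-order request,
   * don't feed it into the pressure accounting, since the "pressure"
   * is an artifact of the order rather than a real memory shortage */
  if (order > PAGE_ALLOC_COSTLY_ORDER)
  	return;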
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
It can be normal for a dying process to have its page allocation request
fail when it has an OOM or LMK kill pending. In this case, it's actually
detrimental to print out a massive allocation failure message, because
the dying process needs to release its memory quickly, and that is
slowed down slightly by the massive kmsg splat. The
allocation failure message is also a false positive in this case, since
the failure is intentional rather than being the result of an inability
to allocate memory.
Suppress the allocation failure warning for processes that are killed to
release memory in order to expedite their death and remedy the kmsg
confusion from seeing spurious allocation failure messages.
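A minimal sketch of the kind of check involved, using real helpers but a
simplified call site (the exact placement in the allocator slow path is
not shown):

  #include <linux/oom.h>
  #include <linux/sched/signal.h>

  /* sketch: skip the allocation-failure splat for tasks that are
   * already being killed to free memory */
  static bool example_suppress_alloc_warning(void)
  {
  	return tsk_is_oom_victim(current) || fatal_signal_pending(current);
  }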
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Caching the window size can result in delayed or inaccurate pressure
reports. Since calculating a fresh window size is cheap, do so all the
time instead of relying on a stale, cached value.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
When no pages are scanned, it usually means no zones were reclaimable
and nothing could be done. In this case, the reported pressure should be
100 to elicit help from any listeners. This fixes the vmpressure
framework not working when memory pressure is very high.
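A sketch of the intended calculation; vmpressure's scale is 100, and the
variable names here mirror the idea rather than the driver code verbatim:

  /* sketch: nothing scanned means nothing was reclaimable, so report
   * maximum pressure instead of bailing out early */
  unsigned long pressure;

  if (!scanned)
  	pressure = 100;
  else
  	pressure = 100 - (100 * reclaimed / scanned);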
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Although userspace processes can't directly help with kernel memory
pressure, killing userspace processes can relieve kernel memory if they
are responsible for that pressure in the first place. It doesn't make
sense to exclude any allocation types knowing that userspace can indeed
affect all memory pressure, so don't exclude any allocation types from
the pressure calculations.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Keeping kswapd running when all the failed allocations that invoked it
are satisfied incurs a high overhead due to unnecessary page eviction
and writeback, as well as spurious VM pressure events to various
registered shrinkers. When kswapd doesn't need to work to make an
allocation succeed anymore, stop it prematurely to save resources.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
These are causing parts of techpack/audio to get rebuilt on every build
for no reason.
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Using a per-CPU thread pool, we can reduce scheduling latency compared
to the workqueue implementation. With this patch, scheduling latency and
its variation are reduced, as the per-CPU threads are high-priority
kthread_workers.
The results were evaluated on arm64 Android devices running a 5.10
kernel.
The table below shows the resulting improvements in total scheduling
latency for the same app launch benchmark runs with 50 iterations.
Scheduling
latency is the latency between when the task (workqueue kworker vs
kthread_worker) became eligible to run to when it actually started
running.
+-------------------------+-----------+----------------+---------+
| | workqueue | kthread_worker | diff |
+-------------------------+-----------+----------------+---------+
| Average (us) | 15253 | 2914 | -80.89% |
| Median (us) | 14001 | 2912 | -79.20% |
| Minimum (us) | 3117 | 1027 | -67.05% |
| Maximum (us) | 30170 | 3805 | -87.39% |
| Standard deviation (us) | 7166 | 359 | |
+-------------------------+-----------+----------------+---------+
Background: Boot times and cold app launch benchmarks are very
important to the android ecosystem as they directly translate to
responsiveness from the user's point of view. While erofs provides
a lot of important features like space savings, we saw some
performance penalty in cold app launch benchmarks in a few scenarios.
Analysis showed that the significant variance was coming from the
scheduling cost, while the decompression cost was more or less the same.
With a per-CPU thread pool, we can see from the table above that this
variation is reduced by ~80% on average. This problem was discussed
at LPC 2022; a link to the LPC 2022 slides and talk is at [1].
[1] https://lpc.events/event/16/contributions/1338/
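A hedged sketch of how such a pool can be built with the kthread_worker
API; the names here (example_workers, the "erofs_worker/%u" prefix) are
illustrative rather than the ones used by the patch, and
sched_set_fifo_low() assumes a recent kernel (older trees would use
sched_setscheduler_nocheck() instead):

  #include <linux/kthread.h>
  #include <linux/cpumask.h>
  #include <linux/err.h>

  static struct kthread_worker *example_workers[NR_CPUS];

  static int example_init_pcpu_workers(void)
  {
  	struct kthread_worker *worker;
  	int cpu;

  	for_each_online_cpu(cpu) {
  		worker = kthread_create_worker_on_cpu(cpu, 0,
  						      "erofs_worker/%u", cpu);
  		if (IS_ERR(worker))
  			return PTR_ERR(worker);

  		/* high-priority kthread_worker so decompression work
  		 * isn't stuck behind ordinary CFS tasks */
  		sched_set_fifo_low(worker->task);
  		example_workers[cpu] = worker;
  	}
  	return 0;
  }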
Link: https://lore.kernel.org/lkml/Y+DP6V9fZG7XPPGy@debian/
Change-Id: I454da5bc17f285d99047b93dc1fc70444f287156
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
- This fixes the following warnings:
In file included from ../fs/f2fs/super.c:28:
../include/linux/lz4.h:221:12: warning: 'LZ4_compress_fast' declared 'static' but never defined [-Wunused-function]
221 | static int LZ4_compress_fast(const char *source, char *dest, int inputSize,
| ^~~~~~~~~~~~~~~~~
../include/linux/lz4.h:245:12: warning: 'LZ4_compress_destSize' declared 'static' but never defined [-Wunused-function]
245 | static int LZ4_compress_destSize(const char *source, char *dest, int *sourceSizePtr,
| ^~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:361:13: warning: 'LZ4_resetStreamHC' declared 'static' but never defined [-Wunused-function]
361 | static void LZ4_resetStreamHC(LZ4_streamHC_t *streamHCPtr, int compressionLevel);
| ^~~~~~~~~~~~~~~~~
../include/linux/lz4.h:376:17: warning: 'LZ4_loadDictHC' declared 'static' but never defined [-Wunused-function]
376 | static int LZ4_loadDictHC(LZ4_streamHC_t *streamHCPtr, const char *dictionary,
| ^~~~~~~~~~~~~~
../include/linux/lz4.h:415:12: warning: 'LZ4_compress_HC_continue' declared 'static' but never defined [-Wunused-function]
415 | static int LZ4_compress_HC_continue(LZ4_streamHC_t *streamHCPtr, const char *src,
| ^~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:434:12: warning: 'LZ4_saveDictHC' declared 'static' but never defined [-Wunused-function]
434 | static int LZ4_saveDictHC(LZ4_streamHC_t *streamHCPtr, char *safeBuffer,
| ^~~~~~~~~~~~~~
../include/linux/lz4.h:450:29: warning: 'LZ4_resetStream' declared 'static' but never defined [-Wunused-function]
450 | static __always_inline void LZ4_resetStream(LZ4_stream_t *LZ4_stream);
| ^~~~~~~~~~~~~~~
../include/linux/lz4.h:507:12: warning: 'LZ4_compress_fast_continue' declared 'static' but never defined [-Wunused-function]
507 | static int LZ4_compress_fast_continue(LZ4_stream_t *streamPtr, const char *src,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:521:12: warning: 'LZ4_setStreamDecode' declared 'static' but never defined [-Wunused-function]
521 | static int LZ4_setStreamDecode(LZ4_streamDecode_t *LZ4_streamDecode,
| ^~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:560:12: warning: 'LZ4_decompress_safe_continue' declared 'static' but never defined [-Wunused-function]
560 | static int LZ4_decompress_safe_continue(LZ4_streamDecode_t *LZ4_streamDecode,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:599:12: warning: 'LZ4_decompress_fast_continue' declared 'static' but never defined [-Wunused-function]
599 | static int LZ4_decompress_fast_continue(LZ4_streamDecode_t *LZ4_streamDecode,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:622:12: warning: 'LZ4_decompress_safe_usingDict' declared 'static' but never defined [-Wunused-function]
622 | static int LZ4_decompress_safe_usingDict(const char *source, char *dest,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../include/linux/lz4.h:645:12: warning: 'LZ4_decompress_fast_usingDict' declared 'static' but never defined [-Wunused-function]
645 | static int LZ4_decompress_fast_usingDict(const char *source, char *dest,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
- 64KB does not seem to behave well under high memory pressure, so reduce it to 16KB, which is the default.
Suggested-by: vantoman <mustafa.vantom@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
The big pcluster feature has been merged for a year and has been mostly
stable now.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20220407050505.12683-1-huyue2@coolpad.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Whilst we currently provide smp_cond_load_acquire() and
atomic_cond_read_acquire(), there are cases where the ACQUIRE semantics are
not required because of a subsequent fence or release operation once the
conditional loop has exited.
This patch adds relaxed versions of the conditional spinning primitives
to avoid unnecessary barrier overhead on architectures such as arm64.
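For instance (a sketch; `flag' is a hypothetical shared variable), a
waiter that relies on a later fence for ordering can now spin with plain
loads:

  #include <linux/atomic.h>

  static u32 example_wait_for_flag(u32 *flag)
  {
  	/* VAL names the value loaded on each iteration */
  	u32 v = smp_cond_load_relaxed(flag, VAL != 0);

  	smp_rmb();	/* the subsequent fence provides the ordering
  			 * that the ACQUIRE variant would have */
  	return v;
  }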
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boqun.feng@gmail.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1524738868-31318-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
This results in no change in structure size on 64-bit machines as it
fits in the padding between the gfp_t and the void *. 32-bit machines
will grow the structure from 8 to 12 bytes. Almost all radix trees are
protected with (at least) a spinlock, so as they are converted from
radix trees to xarrays, the data structures will shrink again.
Initialising the spinlock requires a name for the benefit of lockdep, so
RADIX_TREE_INIT() now needs to know the name of the radix tree it's
initialising, and so do IDR_INIT() and IDA_INIT().
Also add the xa_lock() and xa_unlock() family of wrappers to make it
easier to use the lock. If we could rely on -fplan9-extensions in the
compiler, we could avoid all of this syntactic sugar, but that wasn't
added until gcc 4.6.
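A brief sketch of the wrappers in use; the tree and its caller here are
hypothetical:

  #include <linux/radix-tree.h>
  #include <linux/xarray.h>

  /* RADIX_TREE() -> RADIX_TREE_INIT() now also names the embedded
   * spinlock for lockdep */
  static RADIX_TREE(example_tree, GFP_ATOMIC);

  static void example_store(unsigned long index, void *item)
  {
  	xa_lock(&example_tree);		/* takes the embedded xa_lock */
  	radix_tree_insert(&example_tree, index, item);	/* error handling omitted */
  	xa_unlock(&example_tree);
  }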
Link: http://lkml.kernel.org/r/20180313132639.17387-8-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[@RealJohnGalt: adapt to 4.14]
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
In most places in the kernel where we need to distinguish functions by
the type of their arguments, we use '_ul' as a suffix for the unsigned
long variant, not '_ext'. Also add kernel-doc.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
All current users of idr_alloc_ext() actually want to allocate a u32
and idr_alloc_u32() fits their needs better.
Like idr_get_next(), it uses a 'nextid' argument which serves as both
a pointer to the start ID and the assigned ID (instead of a separate
minimum and pointer-to-assigned-ID argument). It uses a 'max' argument
rather than 'end' because the semantics that idr_alloc has for 'end'
don't work well for unsigned types.
Since idr_alloc_u32() returns an errno instead of the allocated ID, mark
it as __must_check to help callers use it correctly. Include copious
kernel-doc. Chris Mi <chrism@mellanox.com> has promised to contribute
test-cases for idr_alloc_u32.
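A short usage sketch; `example_idr' and `ptr' are hypothetical:

  static DEFINE_IDR(example_idr);

  static int example_add(void *ptr, u32 *out_id)
  {
  	u32 id = 0;	/* start searching from 0 */
  	int err;

  	err = idr_alloc_u32(&example_idr, ptr, &id, UINT_MAX, GFP_KERNEL);
  	if (err)
  		return err;	/* an errno, not an ID */

  	*out_id = id;	/* the assigned ID comes back through 'id' */
  	return 0;
  }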
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
XFS currently contains a copy-and-paste of __set_page_dirty(). Export
it from buffer.c instead.
Link: http://lkml.kernel.org/r/20180313132639.17387-6-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Patch series "XArray", v9. (First part thereof).
This patchset is, I believe, appropriate for merging for 4.17. It
contains the XArray implementation, to eventually replace the radix
tree, and converts the page cache to use it.
This conversion keeps the radix tree and XArray data structures in sync
at all times. That allows us to convert the page cache one function at
a time and should allow for easier bisection. Other than renaming some
elements of the structures, the data structures are fundamentally
unchanged; a radix tree walk and an XArray walk will touch the same
number of cachelines. I have changes planned to the XArray data
structure, but those will happen in future patches.
Improvements the XArray has over the radix tree:
- The radix tree provides operations like other trees do; 'insert' and
'delete'. But what most users really want is an automatically
resizing array, and so it makes more sense to give users an API that
is like an array -- 'load' and 'store'. We still have an 'insert'
operation for users that really want that semantic.
- The XArray considers locking as part of its API. This simplifies a
lot of users who formerly had to manage their own locking just for
the radix tree. It also improves code generation as we can now tell
RCU that we're holding a lock and it doesn't need to generate as much
fencing code. The other advantage is that tree nodes can be moved
(not yet implemented).
- GFP flags are now parameters to calls which may need to allocate
memory. The radix tree forced users to decide what the allocation
flags would be at creation time. It's much clearer to specify them at
allocation time.
- Memory is not preloaded; we don't tie up dozens of pages on the off
chance that the slab allocator fails. Instead, we drop the lock,
allocate a new node and retry the operation. We have to convert all
the radix tree, IDA and IDR preload users before we can realise this
benefit, but I have not yet found a user which cannot be converted.
- The XArray provides a cmpxchg operation. The radix tree forces users
to roll their own (and at least four have).
- Iterators take a 'max' parameter. That simplifies many users and will
reduce the amount of iteration done.
- Iteration can proceed backwards. We only have one user for this, but
since it's called as part of the pagefault readahead algorithm, that
seemed worth mentioning.
- RCU-protected pointers are not exposed as part of the API. There are
some fun bugs where the page cache forgets to use rcu_dereference()
in the current codebase.
- Value entries gain an extra bit compared to radix tree exceptional
entries. That gives us the extra bit we need to put huge page swap
entries in the page cache.
- Some iterators now take a 'filter' argument instead of having
separate iterators for tagged/untagged iterations.
The page cache is improved by this:
- Shorter, easier to read code
- More efficient iterations
- Reduction in size of struct address_space
- Fewer walks from the top of the data structure; the XArray API
encourages staying at the leaf node and conducting operations there.
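To make the 'load'/'store' API above concrete, a tiny illustration using
the eventual XArray interface; the array, index and item are
hypothetical:

  #include <linux/xarray.h>

  static DEFINE_XARRAY(example_array);

  static void *example_publish(void *item)
  {
  	/* store grows the array as needed and returns the previous
  	 * entry (or an xa_err()-encoded error) */
  	void *old = xa_store(&example_array, 42, item, GFP_KERNEL);

  	if (xa_is_err(old))
  		return NULL;

  	/* load is a plain indexed lookup; NULL if nothing is stored */
  	return xa_load(&example_array, 42);	/* now 'item' */
  }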
This patch (of 8):
None of these bits may be used for slab allocations, so we can use them
as radix tree flags as long as we mask them off before passing them to
the slab allocator. Move the IDR flag from the high bits to the
GFP_ZONEMASK bits.
Link: http://lkml.kernel.org/r/20180313132639.17387-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
As Xiang mentioned, such a path has no real impact on our current
decompression strategy, so remove it directly. Also, update the return
value of z_erofs_lz4_decompress() to 0 on success to keep it consistent
with LZMA, which will also return 0 in that case.
Link: https://lore.kernel.org/r/20211014065744.1787-1-zbestahu@gmail.com
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Yue Hu <huyue2@yulong.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Currently, the whole set of indexes will only be compacted to 4B if
compacted_4b_initial > totalidx, so the calculated compacted_2b is
worthless in that case and may waste CPU resources.
There is no need to update compacted_4b_initial as mkfs does, since it's
used to fulfill the alignment of the 1st compacted_2b pack and already
handles the case above.
We also need to clarify compacted_4b_end here: it's used for the last
lclusters which don't fit in the previous compacted_2b packs.
Some messages are from Xiang.
Link: https://lore.kernel.org/r/20210914035915.1190-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[ Gao Xiang: it's enough to use "compacted_4b_initial < totalidx". ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
The mapping is not used at all; remove it and update the related code.
Link: https://lore.kernel.org/r/20210810072416.1392-1-zbestahu@gmail.com
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Yue Hu <huyue2@yulong.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
The variable `occupied' has no effect in z_erofs_attach_page(), which
is the only caller of z_erofs_pagevec_enqueue().
Link: https://lore.kernel.org/r/20210419102623.2015-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
If the lcluster right after the HEAD lcluster of a pcluster is a HEAD or
PLAIN type rather than a CBLKCNT NONHEAD type, the pclustersize _must_
be 1 lcluster (since its uncompressed size is < 2 lclusters),
as illustrated below:
HEAD HEAD / PLAIN lcluster type
____________ ____________
|_:__________|_________:__| file data (uncompressed)
. .
.____________.
|____________| pcluster data (compressed)
Such an on-disk case was explained before [1] but was missed and not
handled properly in the runtime implementation.
It can be observed by manually generating a 1 lcluster-sized pcluster
with 2 lclusters (thus CBLKCNT doesn't exist). Let's fix it now.
[1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org
Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org
Fixes: cec6e93beadf ("erofs: support parsing big pcluster compress indexes")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Prior to big pcluster, there was only one compressed page, so it was
easy to map. However, when big pcluster is enabled, more work needs to
be done to handle multiple compressed pages. In detail,
- (maptype 0) if there is only one compressed page + no need
to copy inplace I/O, just map it directly what we did before;
- (maptype 1) if there are more compressed pages + no need to
copy inplace I/O, vmap such compressed pages instead;
- (maptype 2) if inplace I/O needs to be copied, use per-CPU
buffers for decompression then.
Another thing is how to detect whether inplace decompression is feasible
(it's still quite easy for non-big pclusters): apart from the inplace
margin calculation, the inplace I/O page reuse order also needs to be
considered for each compressed page. Currently, if the compressed page
is the xth page, it shouldn't be reused as [0 ...
nrpages_out - nrpages_in + x]; otherwise a full copy will be triggered.
Although there are some extra optimization ideas for this, I'd like to
make big pcluster work correctly first; it can obviously be further
optimized later since it has nothing to do with the on-disk format at
all.
Link: https://lore.kernel.org/r/20210407043927.10623-10-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Different from non-compact indexes, several lclusters are packed as the
compact form at once and a unique base blkaddr is stored for each pack,
so each lcluster index takes less space on average (e.g. 2 bytes for
COMPACT_2B.) By the way, that is also why the BIG_PCLUSTER switch should
be consistent for compact head0/1.
Prior to big pcluster, the size of all pclusters was 1 lcluster.
Therefore, when a new HEAD lcluster was scanned, blkaddr would be
bumped by 1 lcluster. However, that way doesn't work anymore for
big pcluster since we actually don't know the compressed size of
pclusters in advance (before reading CBLKCNT lcluster).
So, instead, let blkaddr of each pack be the first pcluster blkaddr
with a valid CBLKCNT, in detail,
1) if CBLKCNT starts at the pack, this first valid pcluster is
itself, e.g.
_____________________________________________________________
|_CBLKCNT0_|_NONHEAD_| .. |_HEAD_|_CBLKCNT1_| ... |_HEAD_| ...
^ = blkaddr base ^ += CBLKCNT0 ^ += CBLKCNT1
2) if CBLKCNT doesn't start at the pack, the first valid pcluster
is the next pcluster, e.g.
_________________________________________________________
| NONHEAD_| .. |_HEAD_|_CBLKCNT0_| ... |_HEAD_|_HEAD_| ...
^ = blkaddr base ^ += CBLKCNT0
^ += 1
When a CBLKCNT is found, blkaddr will be increased by CBLKCNT lclusters;
or, if a new HEAD is found immediately, bump blkaddr by 1 instead (see
the picture above.)
Also note that if CBLKCNT is at the end of the pack, instead of storing
delta1 (the distance to the next HEAD lcluster) as normal NONHEADs do,
it still uses the compressed block count (delta0), since delta1 can be
calculated indirectly but the block count can't.
Adjust decoding logic to fit big pcluster compact indexes as well.
Link: https://lore.kernel.org/r/20210407043927.10623-9-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
When the INCOMPAT_BIG_PCLUSTER sb feature is enabled, legacy compress
indexes will also have the same on-disk header as compact indexes to
keep per-file configurations instead of leaving it zeroed.
If ADVISE_BIG_PCLUSTER is set for a file, CBLKCNT will be loaded for each
pcluster in this file by parsing 1st non-head lcluster.
Link: https://lore.kernel.org/r/20210407043927.10623-8-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Big pcluster indicates that the size of compressed data for each
physical pcluster is no longer fixed as the block size, but could be
more than 1 block (more accurately, 1 logical pcluster).
When the big pcluster feature is enabled for head0/1, delta0 of the 1st
non-head lcluster index will keep the block count of this pcluster in
lcluster units instead of 1. Otherwise, the compressed size of the
pcluster should be 1 lcluster if the pcluster has no non-head lcluster
index.
Also note that BIG_PCLUSTER feature reuses COMPR_CFGS feature since
it depends on COMPR_CFGS and will be released together.
Link: https://lore.kernel.org/r/20210407043927.10623-6-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
When picking up inplace I/O pages, they should be traversed in reverse
order, in line with the traversal order of file-backed online pages.
Also, the index should be updated together when preloading compressed
pages.
Previously, only page-sized pclustersize was supported, so there was no
problem at all. Also rename `compressedpages' to `icpage_ptr' to reflect
its functionality.
Link: https://lore.kernel.org/r/20210407043927.10623-5-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Since multiple pcluster sizes could be used at once, the number of
compressed pages will become a variable factor. It's necessary to
introduce slab pools rather than a single slab cache now.
This limits the pclustersize to 1M (Z_EROFS_PCLUSTER_MAX_SIZE) and gets
rid of the obsolete EROFS_FS_CLUSTER_PAGE_LIMIT, which has no use now.
Link: https://lore.kernel.org/r/20210407043927.10623-4-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
To deal with the cases in which inplace decompression is infeasible for
some inplace I/O, per-CPU buffers were introduced to get rid of page
allocation latency and thrashing for low-latency decompression
algorithms such as lz4.
For the big pcluster feature, introduce multipage per-CPU buffers to
keep such inplace I/O pclusters temporarily as well, but note that
per-CPU pages are just consecutive virtually.
When a new big pcluster fs is mounted, its max pclustersize will be read
and the per-CPU buffers can be grown if needed. Shrinking adjustable
per-CPU buffers is more complex (because we don't know if such a size is
still in use), so currently just release them all when unloading.
Link: https://lore.kernel.org/r/20210409190630.19569-1-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
The formal big pcluster design is actually more powerful / flexible than
previously thought, where pclustersize was fixed as power-of-2 blocks,
which was obviously inefficient and space-wasting. Instead, pclustersize
can now be set independently for each pcluster, so various pcluster
sizes can also be used together in one file if mkfs wants (for example,
according to data type and/or compression ratio).
Let's get rid of the previous physical_clusterbits[] setting (also
notice that the corresponding on-disk fields are still 0 for now).
Therefore, head1/2 can be used for at most 2 different algorithms in one
file and, again, pclustersize is now independent of these.
Link: https://lore.kernel.org/r/20210407043927.10623-2-xiang@kernel.org
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Add a bitmap for available compression algorithms and a variable-sized
on-disk table for compression options, which follows the end of the
super block, in preparation for the upcoming big pcluster and LZMA
algorithm.
To parse the compression options, the bitmap is scanned one by one.
For each available algorithm, there is data followed by a 2-byte
`length' correspondingly (it's enough for most cases, or entire fs
blocks should be used.)
With such an available algorithm bitmap, the kernel itself can also
refuse to mount such a filesystem if any unsupported compression
algorithm exists.
Note that the COMPR_CFGS feature will be enabled with BIG_PCLUSTER.
Link: https://lore.kernel.org/r/20210329100012.12980-1-hsiangkao@aol.com
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>
Introduce z_erofs_lz4_cfgs to store all lz4 configurations.
Currently it only holds max_distance, but it will be used for new
features later.
Link: https://lore.kernel.org/r/20210329012308.28743-4-hsiangkao@aol.com
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Ruchit <ruchitmarathe@gmail.com>