Make sure we check the truncate constraints early on in ->setattr by adding
those checks to inode_change_ok. Also clean up and document inode_change_ok
to make this obvious.
As a fallout we don't have to call inode_newsize_ok from simple_setsize and
simplify it down to a truncate_setsize which doesn't return an error. This
simplifies a lot of setattr implementations and means we use truncate_setsize
almost everywhere. Get rid of fat_setsize now that it's trivial and mark
ext2_setsize static to make the calling convention obvious.
Keep the inode_newsize_ok in vmtruncate for now as all callers need an
audit for its removal anyway.
Note: setattr code in ecryptfs doesn't call inode_change_ok at all and
needs a deeper audit, but that is left for later.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pointed out by Lucas who found the new one in a comment in
setup_percpu.c. And then I fixed the others that I grepped
for.
Reported-by: Lucas <canolucas@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
jffs2 didn't update the ctime of the file when its permission was changed.
Steps to reproduce:
# touch aaa
# stat -c %Z aaa
1275289822
# setfacl -m 'u::x,g::x,o::x' aaa
# stat -c %Z aaa
1275289822 <- unchanged
But, according to the spec of the ctime, jffs2 must update it.
Port of ext3 patch by Miao Xie <miaox@cn.fujitsu.com>.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
New inodes need to be locked as we're creating them, so they don't get used
by other things (like NFSd) before they're ready.
Pointed out by Al Viro.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Lots of filesystems calls vmtruncate despite not implementing the old
->truncate method. Switch them to use simple_setsize and add some
comments about the truncate code where it seems fitting.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This is the culmination of this sequence of patches. By moving the block
erasing from jffs2_write_super() into the GC code, we avoid huge
latencies on unmount where it waits for _all_ pending blocks to be
erased, and we allow better control for time-critical tasks by stopping
the GC thread.
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Now that we do erases from GC and trigger the GC thread to do them
instead of using kupdated, this function is misnamed. It's only used
for triggering wbuf flush on NAND flash now. Rename it accordingly.
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We're about to call this from a bunch of places which already hold
c->erase_completion_lock, so add an assertion and change its existing
callers to do the same.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Now that we trigger block erases from jffs2_garbage_collect_pass(),
adjust jffs2_thread_should_wake() to return 1 when there are blocks to
erase.
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
jffs2_garbage_collect_pass() would previously return -EAGAIN if it
couldn't find anything to garbage collect from, and there were blocks on
the erase_pending_list. If the blocks were actually in the process of
being erased, though, then they wouldn't be on that list. Check for
nr_erasing_blocks being non-zero instead.
Fix jffs2_reserve_space() to wait for the in-progress erases to
complete, when jffs2_garbage_collect_pass() returns -EAGAIN.
And fix jffs2_erase_succeeded() to actually wake up the erase_wait wq
that jffs2_reserve_space() is now using.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We're about to start calling this from the jffs2_garbage_collect_pass(), and
we'll want to know whether it actually did anything or not.
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
jffs2 uses rb_node = NULL; to zero rb_root.
The problem with this is that 17d9ddc72f ("rbtree: Add
support for augmented rbtrees") in the linux-next tree adds a new field
to that struct which needs to be NULL as well. This patch uses RB_ROOT
as the intializer so all of the relevant fields will be NULL'd.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Eric Paris <eparis@redhat.com>
Acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods. This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute. With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.
Also change the inode argument to the handlers to a dentry to allow
using the handlers mechnism for filesystems that require it later,
e.g. cifs.
[with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ever since jffs2_garbage_collect_metadata() was first half-written in
February 2001, it's been broken on architectures where 'char' is signed.
When garbage collecting a symlink with target length above 127, the payload
length would end up negative, causing interesting and bad things to happen.
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In 2.6.23 kernel, commit a32ea1e1f9
("Fix read/truncate race") fixed a race in the generic code, and as a
side effect, now do_generic_file_read() can ask us to readpage() past
the i_size. This seems to be correctly handled by the block routines
(e.g. block_read_full_page() fills the page with zeroes in case if
somebody is trying to read past the last inode's block).
JFFS2 doesn't handle this; it assumes that it won't be asked to read
pages which don't exist -- and thus that there will be at least _one_
valid 'frag' on the page it's being asked to read. It will fill any
holes with the following memset:
memset(buf, 0, min(end, frag->ofs + frag->size) - offset);
When the 'closest smaller match' returned by jffs2_lookup_node_frag() is
actually on a previous page and ends before 'offset', that results in:
memset(buf, 0, <huge unsigned negative>);
Hopefully, in most cases the corruption is fatal, and quickly causing
random oopses, like this:
root@10.0.0.4:~/ltp-fs-20090531# ./testcases/kernel/fs/ftest/ftest01
Unable to handle kernel paging request for data at address 0x00000008
Faulting instruction address: 0xc01cd980
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c01cd980] rb_insert_color+0x38/0x184
LR [c0043978] enqueue_hrtimer+0x88/0xc4
Call Trace:
[c6c63b60] [c004f9a8] tick_sched_timer+0xa0/0xe4 (unreliable)
[c6c63b80] [c0043978] enqueue_hrtimer+0x88/0xc4
[c6c63b90] [c0043a48] __run_hrtimer+0x94/0xbc
[c6c63bb0] [c0044628] hrtimer_interrupt+0x140/0x2b8
[c6c63c10] [c000f8e8] timer_interrupt+0x13c/0x254
[c6c63c30] [c001352c] ret_from_except+0x0/0x14
--- Exception: 901 at memset+0x38/0x5c
LR = jffs2_read_inode_range+0x144/0x17c
[c6c63cf0] [00000000] (null) (unreliable)
This patch fixes the issue, plus fixes all LTP tests on NAND/UBI with
JFFS2 filesystem that were failing since 2.6.23 (seems like the bug
above also broke the truncation).
Reported-By: Anton Vorontsov <avorontsov@ru.mvista.com>
Tested-By: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MAX_SUMMARY_SIZE was meant as a limit, not as a minimum
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The returned error should stay negative
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We may end up doing DMA to/from these. Until the new MTD API fixes the
issues, this should stop things from falling over.
Original idea from Gilles Casse <list@gcasse.net>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This avoids an indirect call in the VFS for each path component lookup.
Well, at least as long as you own the directory in question, and the ACL
check is unnecessary.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the jffs2 garbage collecting thread to the new kthread API.
Signed-off-by: Gerard Lledo <gerard.lledo@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The function jffs2_nor_wbuf_flash_setup() doesn't allocate the verify buffer
if CONFIG_JFFS2_FS_WBUF_VERIFY is defined, so causing a kernel panic when
that macro is enabled and the verify function is called. Similarly the
jffs2_nor_wbuf_flash_cleanup() must free the buffer if
CONFIG_JFFS2_FS_WBUF_VERIFY is enabled.
The following patch fixes the problem.
The following patch applies to 2.6.30 kernel.
Signed-off-by: Massimo Cirillo <maxcir@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT
This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 5fd29d6ccb ("printk: clean up
handling of log-levels and newlines") changed printk semantics. printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.
<level> is now included in the output on each additional use.
Remove all uses of multiple KERN_<level>s in formats.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
helpers: get_cached_acl(inode, type), set_cached_acl(inode, type, acl),
forget_cached_acl(inode, type).
ubifs/xattr.c needed includes reordered, the rest is a plain switchover.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In case of an error returned by file_dirty() 's' is not freed as the cleanup
path is skipped.
Reported by Coverity.
Signed-off-by: Christian Engelmayer <christian.engelmayer@frequentis.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The call to ->write_super from __sync_filesystem will go away, so make
sure jffs2 performs the same actions from inside ->sync_fs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Push down lock_super into ->write_super instances and remove it from the
caller.
Following filesystem don't need ->s_lock in ->write_super and are skipped:
* bfs, nilfs2 - no other uses of s_lock and have internal locks in
->write_super
* ext2 - uses BKL in ext2_write_super and has internal calls without s_lock
* reiserfs - no other uses of s_lock as has reiserfs_write_lock (BKL) in
->write_super
* xfs - no other uses of s_lock and uses internal lock (buffer lock on
superblock buffer) to serialize ->write_super. Also xfs_fs_write_super
is superflous and will go away in the next merge window
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
jffs2_write_super is only called from super.c and doesn't use any
functionality from fs.c. So move it over to super.c and make it
static there.
[should go in through the vfs tree as it is a requirement for the
next patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Move BKL into ->put_super from the only caller. A couple of
filesystems had trivial enough ->put_super (only kfree and NULLing of
s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs,
hugetlbfs, omfs, qnx4, shmem, all others got the full treatment. Most
of them probably don't need it, but I'd rather sort that out individually.
Preferably after all the other BKL pushdowns in that area.
[AV: original used to move lock_super() down as well; these changes are
removed since we don't do lock_super() at all in generic_shutdown_super()
now]
[AV: fuse, btrfs and xfs are known to need no damn BKL, exempt]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We just did a full fs writeout using sync_filesystem before, and if
that's not enough for the filesystem it can perform it's own writeout
in ->put_super, which many filesystems already do.
Move a call to foofs_write_super into every foofs_put_super for now to
guarantee identical behaviour until it's cleaned up by the individual
filesystem maintainers.
Exceptions:
- affs already has identical copy & pasted code at the beginning of
affs_put_super so no need to do it twice.
- xfs does the right thing without it and I have changes pending for
the xfs tree touching this are so I don't really need conflicts
here..
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Erase errors such as:
"Newly-erased block contained word 0xa4ef223e at offset 0x0296a014"
and failure to write the clean marker,
moves the offending erase block to erasing list before calling
jffs2_erase_failed(). This is bad as jffs2_erase_failed() will
also move the block to the bad_list, but is now moving the
wrong block, causing FS corruption.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
size_t s is unsigned and cannot be less than 0.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>