packages (LINUX_3_0): kernel/kernel-small_fixes.patch - xfs stable fixes qu...
arekm
arekm at pld-linux.org
Sat Nov 19 20:36:22 CET 2011
Author: arekm Date: Sat Nov 19 19:36:22 2011 GMT
Module: packages Tag: LINUX_3_0
---- Log message:
- xfs stable fixes queued for 3.0.11
---- Files affected:
packages/kernel:
kernel-small_fixes.patch (1.43.2.5 -> 1.43.2.6)
---- Diffs:
================================================================
Index: packages/kernel/kernel-small_fixes.patch
diff -u packages/kernel/kernel-small_fixes.patch:1.43.2.5 packages/kernel/kernel-small_fixes.patch:1.43.2.6
--- packages/kernel/kernel-small_fixes.patch:1.43.2.5 Fri Nov 18 10:24:43 2011
+++ packages/kernel/kernel-small_fixes.patch Sat Nov 19 20:36:16 2011
@@ -348,64 +348,6 @@
exit
fi
done
-commit 37b652ec6445be99d0193047d1eda129a1a315d3
-Author: Dave Chinner <dchinner at redhat.com>
-Date: Thu Aug 25 07:17:01 2011 +0000
-
- xfs: don't serialise direct IO reads on page cache checks
-
- There is no need to grab the i_mutex of the IO lock in exclusive
- mode if we don't need to invalidate the page cache. Taking these
- locks on every direct IO effective serialises them as taking the IO
- lock in exclusive mode has to wait for all shared holders to drop
- the lock. That only happens when IO is complete, so effective it
- prevents dispatch of concurrent direct IO reads to the same inode.
-
- Fix this by taking the IO lock shared to check the page cache state,
- and only then drop it and take the IO lock exclusively if there is
- work to be done. Hence for the normal direct IO case, no exclusive
- locking will occur.
-
- Signed-off-by: Dave Chinner <dchinner at redhat.com>
- Tested-by: Joern Engel <joern at logfs.org>
- Reviewed-by: Christoph Hellwig <hch at lst.de>
- Signed-off-by: Alex Elder <aelder at sgi.com>
-
-diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
-index 7f7b424..8fd4a07 100644
---- a/fs/xfs/linux-2.6/xfs_file.c
-+++ b/fs/xfs/linux-2.6/xfs_file.c
-@@ -317,7 +317,19 @@ xfs_file_aio_read(
- if (XFS_FORCED_SHUTDOWN(mp))
- return -EIO;
-
-- if (unlikely(ioflags & IO_ISDIRECT)) {
-+ /*
-+ * Locking is a bit tricky here. If we take an exclusive lock
-+ * for direct IO, we effectively serialise all new concurrent
-+ * read IO to this file and block it behind IO that is currently in
-+ * progress because IO in progress holds the IO lock shared. We only
-+ * need to hold the lock exclusive to blow away the page cache, so
-+ * only take lock exclusively if the page cache needs invalidation.
-+ * This allows the normal direct IO case of no page cache pages to
-+ * proceeed concurrently without serialisation.
-+ */
-+ xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
-+ if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) {
-+ xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
- xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
-
- if (inode->i_mapping->nrpages) {
-@@ -330,8 +342,7 @@ xfs_file_aio_read(
- }
- }
- xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
-- } else
-- xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
-+ }
-
- trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
-
An integer overflow will happen on 64bit archs if task's sum of rss, swapents
and nr_ptes exceeds (2^31)/1000 value. This was introduced by commit
@@ -993,3 +935,395 @@
}
/*
+Subject: [PATCH 1/9] "xfs: fix error handling for synchronous writes"
+
+xfs: fix for hang during synchronous buffer write error
+
+If storage is removed while a synchronous buffer write is underway,
+"xfslogd" hangs.
+
+Detailed log http://oss.sgi.com/archives/xfs/2011-07/msg00740.html
+
+Related work bfc60177f8ab509bc225becbb58f7e53a0e33e81
+"xfs: fix error handling for synchronous writes"
+
+Given that xfs_bwrite actually does the shutdown already after
+waiting for the b_iodone completion, and given that we actually
+found that calling xfs_force_shutdown from inside
+xfs_buf_iodone_callbacks was a major contributor to the problem,
+it is better to drop this call.
+
+Signed-off-by: Ajeet Yadav <ajeet.yadav.77 at gmail.com>
+Reviewed-by: Christoph Hellwig <hch at lst.de>
+Signed-off-by: Alex Elder <aelder at sgi.com>
+---
+ fs/xfs/xfs_buf_item.c | 1 -
+ 1 files changed, 0 insertions(+), 1 deletions(-)
+
+diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
+index a7342e8..7888a75 100644
+--- a/fs/xfs/xfs_buf_item.c
++++ b/fs/xfs/xfs_buf_item.c
+@@ -1023,7 +1023,6 @@ xfs_buf_iodone_callbacks(
+ XFS_BUF_UNDELAYWRITE(bp);
+
+ trace_xfs_buf_error_relse(bp, _RET_IP_);
+- xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
+
+ do_callbacks:
+ xfs_buf_do_callbacks(bp);
+--
+1.7.7
+
+
+Subject: [PATCH 2/9] xfs: fix xfs_mark_inode_dirty during umount
+
+During umount we do not add a dirty inode to the lru and wait for it to
+become clean first, but force writeback of data and metadata with
+I_WILL_FREE set. Currently there is no way for XFS to detect that the
+inode has been redirtied for metadata operations, as we skip the
+mark_inode_dirty call during teardown. Fix this by setting i_update_core
+manually in that case, so that the inode gets flushed during inode reclaim.
+
+Alternatively we could enable calling mark_inode_dirty for inodes in
+I_WILL_FREE state, and let the VFS dirty tracking handle this. I decided
+against this as we will get better I/O patterns from reclaim compared to
+the synchronous writeout in write_inode_now, and always marking the inode
+dirty in some way from xfs_mark_inode_dirty is a better safety net in
+either case.
+
+Signed-off-by: Christoph Hellwig <hch at lst.de>
+Reviewed-by: Dave Chinner <dchinner at redhat.com>
+Signed-off-by: Alex Elder <aelder at sgi.com>
+(cherry picked from commit da6742a5a4cc844a9982fdd936ddb537c0747856)
+
+Signed-off-by: Alex Elder <aelder at sgi.com>
+---
+ fs/xfs/linux-2.6/xfs_iops.c | 14 +++++++++++---
+ 1 files changed, 11 insertions(+), 3 deletions(-)
+
+diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
+index d44d92c..a9b3e1e 100644
+--- a/fs/xfs/linux-2.6/xfs_iops.c
++++ b/fs/xfs/linux-2.6/xfs_iops.c
+@@ -69,9 +69,8 @@ xfs_synchronize_times(
+ }
+
+ /*
+- * If the linux inode is valid, mark it dirty.
+- * Used when committing a dirty inode into a transaction so that
+- * the inode will get written back by the linux code
++ * If the linux inode is valid, mark it dirty, else mark the dirty state
++ * in the XFS inode to make sure we pick it up when reclaiming the inode.
+ */
+ void
+ xfs_mark_inode_dirty_sync(
+@@ -81,6 +80,10 @@ xfs_mark_inode_dirty_sync(
+
+ if (!(inode->i_state & (I_WILL_FREE|I_FREEING)))
+ mark_inode_dirty_sync(inode);
++ else {
++ barrier();
++ ip->i_update_core = 1;
++ }
+ }
+
+ void
+@@ -91,6 +94,11 @@ xfs_mark_inode_dirty(
+
+ if (!(inode->i_state & (I_WILL_FREE|I_FREEING)))
+ mark_inode_dirty(inode);
++ else {
++ barrier();
++ ip->i_update_core = 1;
++ }
++
+ }
+
+ /*
+--
+1.7.7
+
+
+Subject: [PATCH 3/9] xfs: fix ->write_inode return values
+
+Currently we always redirty an inode that was attempted to be written out
+synchronously but has been cleaned internally by an AIL push, which is
+rather bogus. Fix that by doing the i_update_core check early on and
+returning 0 for it. Also include async calls for it, as doing any work for
+those is just as pointless. While we're at it, also fix the sign of the
+EIO return in case of a filesystem shutdown, and fix the completely
+nonsensical locking around xfs_log_inode.
+
+Signed-off-by: Christoph Hellwig <hch at lst.de>
+Reviewed-by: Dave Chinner <dchinner at redhat.com>
+Signed-off-by: Alex Elder <aelder at sgi.com>
+(cherry picked from commit 297db93bb74cf687510313eb235a7aec14d67e97)
+
+Signed-off-by: Alex Elder <aelder at sgi.com>
+---
+ fs/xfs/linux-2.6/xfs_super.c | 34 +++++++++-------------------------
+ 1 files changed, 9 insertions(+), 25 deletions(-)
+
+diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
+index 347cae9..28de70b 100644
+--- a/fs/xfs/linux-2.6/xfs_super.c
++++ b/fs/xfs/linux-2.6/xfs_super.c
+@@ -878,33 +878,17 @@ xfs_log_inode(
+ struct xfs_trans *tp;
+ int error;
+
+- xfs_iunlock(ip, XFS_ILOCK_SHARED);
+ tp = xfs_trans_alloc(mp, XFS_TRANS_FSYNC_TS);
+ error = xfs_trans_reserve(tp, 0, XFS_FSYNC_TS_LOG_RES(mp), 0, 0, 0);
+-
+ if (error) {
+ xfs_trans_cancel(tp, 0);
+- /* we need to return with the lock hold shared */
+- xfs_ilock(ip, XFS_ILOCK_SHARED);
+ return error;
+ }
+
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
+-
+- /*
+- * Note - it's possible that we might have pushed ourselves out of the
+- * way during trans_reserve which would flush the inode. But there's
+- * no guarantee that the inode buffer has actually gone out yet (it's
+- * delwri). Plus the buffer could be pinned anyway if it's part of
+- * an inode in another recent transaction. So we play it safe and
+- * fire off the transaction anyway.
+- */
+- xfs_trans_ijoin(tp, ip);
++ xfs_trans_ijoin_ref(tp, ip, XFS_ILOCK_EXCL);
+ xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+- error = xfs_trans_commit(tp, 0);
+- xfs_ilock_demote(ip, XFS_ILOCK_EXCL);
+-
+- return error;
++ return xfs_trans_commit(tp, 0);
+ }
+
+ STATIC int
+@@ -919,7 +903,9 @@ xfs_fs_write_inode(
+ trace_xfs_write_inode(ip);
+
+ if (XFS_FORCED_SHUTDOWN(mp))
+- return XFS_ERROR(EIO);
++ return -XFS_ERROR(EIO);
++ if (!ip->i_update_core)
++ return 0;
+
+ if (wbc->sync_mode == WB_SYNC_ALL) {
+ /*
+@@ -930,12 +916,10 @@ xfs_fs_write_inode(
+ * of synchronous log foces dramatically.
+ */
+ xfs_ioend_wait(ip);
+- xfs_ilock(ip, XFS_ILOCK_SHARED);
+- if (ip->i_update_core) {
+- error = xfs_log_inode(ip);
+- if (error)
+- goto out_unlock;
+- }
++ error = xfs_log_inode(ip);
++ if (error)
++ goto out;
++ return 0;
+ } else {
+ /*
+ * We make this non-blocking if the inode is contended, return
+--
+1.7.7
+
+
+Subject: [PATCH 4/9] xfs: don't serialise direct IO reads on page cache
+
+There is no need to grab the i_mutex of the IO lock in exclusive
+mode if we don't need to invalidate the page cache. Taking these
+locks on every direct IO effective serialises them as taking the IO
+lock in exclusive mode has to wait for all shared holders to drop
+the lock. That only happens when IO is complete, so effective it
+prevents dispatch of concurrent direct IO reads to the same inode.
+
+Fix this by taking the IO lock shared to check the page cache state,
+and only then drop it and take the IO lock exclusively if there is
+work to be done. Hence for the normal direct IO case, no exclusive
+locking will occur.
+
+Signed-off-by: Dave Chinner <dchinner at redhat.com>
+Tested-by: Joern Engel <joern at logfs.org>
+Reviewed-by: Christoph Hellwig <hch at lst.de>
+Signed-off-by: Alex Elder <aelder at sgi.com>
+---
+ fs/xfs/linux-2.6/xfs_file.c | 17 ++++++++++++++---
+ 1 files changed, 14 insertions(+), 3 deletions(-)
+
+diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
+index 7f782af2..93cc02d 100644
+--- a/fs/xfs/linux-2.6/xfs_file.c
++++ b/fs/xfs/linux-2.6/xfs_file.c
+@@ -309,7 +309,19 @@ xfs_file_aio_read(
+ if (XFS_FORCED_SHUTDOWN(mp))
+ return -EIO;
+
+- if (unlikely(ioflags & IO_ISDIRECT)) {
++ /*
++ * Locking is a bit tricky here. If we take an exclusive lock
++ * for direct IO, we effectively serialise all new concurrent
++ * read IO to this file and block it behind IO that is currently in
++ * progress because IO in progress holds the IO lock shared. We only
++ * need to hold the lock exclusive to blow away the page cache, so
++ * only take lock exclusively if the page cache needs invalidation.
++ * This allows the normal direct IO case of no page cache pages to
++ * proceeed concurrently without serialisation.
++ */
++ xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
++ if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) {
++ xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
+ xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
+
+ if (inode->i_mapping->nrpages) {
+@@ -322,8 +334,7 @@ xfs_file_aio_read(
+ }
+ }
+ xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
+- } else
+- xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
++ }
+
+ trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
+
+--
+1.7.7
+
+
+Subject: [PATCH 5/9] xfs: avoid direct I/O write vs buffered I/O race
+
+Currently a buffered reader or writer can add pages to the pagecache
+while we are waiting for the iolock in xfs_file_dio_aio_write. Prevent
+this by re-checking mapping->nrpages after we got the iolock, and if
+necessary upgrade the lock to exclusive mode. To simplify this a bit,
+only take the ilock inside of xfs_file_aio_write_checks.
+
+Signed-off-by: Christoph Hellwig <hch at lst.de>
+Reviewed-by: Dave Chinner <dchinner at redhat.com>
+Signed-off-by: Alex Elder <aelder at sgi.com>
+---
+ fs/xfs/linux-2.6/xfs_file.c | 17 ++++++++++++++---
+ 1 files changed, 14 insertions(+), 3 deletions(-)
+
+diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
+index 93cc02d..b679198 100644
+--- a/fs/xfs/linux-2.6/xfs_file.c
++++ b/fs/xfs/linux-2.6/xfs_file.c
+@@ -669,6 +669,7 @@ xfs_file_aio_write_checks(
+ xfs_fsize_t new_size;
+ int error = 0;
+
++ xfs_rw_ilock(ip, XFS_ILOCK_EXCL);
+ error = generic_write_checks(file, pos, count, S_ISBLK(inode->i_mode));
+ if (error) {
+ xfs_rw_iunlock(ip, XFS_ILOCK_EXCL | *iolock);
+@@ -760,14 +761,24 @@ xfs_file_dio_aio_write(
+ *iolock = XFS_IOLOCK_EXCL;
+ else
+ *iolock = XFS_IOLOCK_SHARED;
+- xfs_rw_ilock(ip, XFS_ILOCK_EXCL | *iolock);
++ xfs_rw_ilock(ip, *iolock);
+
+ ret = xfs_file_aio_write_checks(file, &pos, &count, iolock);
+ if (ret)
+ return ret;
+
++ /*
++ * Recheck if there are cached pages that need invalidate after we got
++ * the iolock to protect against other threads adding new pages while
++ * we were waiting for the iolock.
++ */
++ if (mapping->nrpages && *iolock == XFS_IOLOCK_SHARED) {
++ xfs_rw_iunlock(ip, *iolock);
++ *iolock = XFS_IOLOCK_EXCL;
++ xfs_rw_ilock(ip, *iolock);
++ }
++
+ if (mapping->nrpages) {
+- WARN_ON(*iolock != XFS_IOLOCK_EXCL);
+ ret = -xfs_flushinval_pages(ip, (pos & PAGE_CACHE_MASK), -1,
+ FI_REMAPF_LOCKED);
+ if (ret)
+@@ -812,7 +823,7 @@ xfs_file_buffered_aio_write(
+ size_t count = ocount;
+
+ *iolock = XFS_IOLOCK_EXCL;
+- xfs_rw_ilock(ip, XFS_ILOCK_EXCL | *iolock);
++ xfs_rw_ilock(ip, *iolock);
+
+ ret = xfs_file_aio_write_checks(file, &pos, &count, iolock);
+ if (ret)
+--
+1.7.7
+
+
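Editor's note: patches 4/9 and 5/9 apply the same iolock discipline - take the lock shared for the common case, upgrade to exclusive only when the page cache actually needs invalidating, and recheck the condition after the upgrade because it can change while the lock is dropped. Below is a minimal userspace sketch of that pattern, using a pthread rwlock as a stand-in for the XFS iolock; direct_io_read, cached_pages and invalidate_page_cache are illustrative names, not kernel APIs.

#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t iolock = PTHREAD_RWLOCK_INITIALIZER;
static int cached_pages;                 /* stands in for mapping->nrpages */

static void invalidate_page_cache(void)
{
        cached_pages = 0;
}

static void direct_io_read(void)
{
        pthread_rwlock_rdlock(&iolock);  /* shared lock: the common case */
        if (cached_pages) {
                /*
                 * Invalidation needs exclusive access.  pthreads cannot
                 * upgrade in place, so drop and retake the lock, then
                 * RECHECK the condition: it may have changed while the
                 * lock was not held (the point of the recheck in 5/9).
                 */
                pthread_rwlock_unlock(&iolock);
                pthread_rwlock_wrlock(&iolock);
                if (cached_pages)
                        invalidate_page_cache();
                /*
                 * Go back to shared mode for the IO itself; XFS does this
                 * atomically with xfs_rw_ilock_demote().
                 */
                pthread_rwlock_unlock(&iolock);
                pthread_rwlock_rdlock(&iolock);
        }
        /* ... submit the direct IO read under the shared lock ... */
        pthread_rwlock_unlock(&iolock);
}

int main(void)
{
        cached_pages = 3;
        direct_io_read();
        printf("cached pages after direct IO read: %d\n", cached_pages);
        return 0;
}
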
+Subject: [PATCH 6/9] xfs: Return -EIO when xfs_vn_getattr() failed
+
+An inode's attributes can be fetched via xfs_vn_getattr() in XFS.
+Currently it returns EIO, not a negative value, when it fails. As a
+result, the system call does not return a negative value even though
+an error occurred. The stat(2), ls and mv commands cannot handle this
+error and do not work correctly.
+
+This patch fixes the bug by returning -EIO, not EIO, when an error
+is detected in xfs_vn_getattr().
+
+Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu at hitachi.com>
+Reviewed-by: Christoph Hellwig <hch at lst.de>
+Signed-off-by: Alex Elder <aelder at sgi.com>
+---
+ fs/xfs/linux-2.6/xfs_iops.c | 2 +-
+ 1 files changed, 1 insertions(+), 1 deletions(-)
+
+diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
+index a9b3e1e..f5b697b 100644
+--- a/fs/xfs/linux-2.6/xfs_iops.c
++++ b/fs/xfs/linux-2.6/xfs_iops.c
+@@ -464,7 +464,7 @@ xfs_vn_getattr(
+ trace_xfs_getattr(ip);
+
+ if (XFS_FORCED_SHUTDOWN(mp))
+- return XFS_ERROR(EIO);
++ return -XFS_ERROR(EIO);
+
+ stat->size = XFS_ISIZE(ip);
+ stat->dev = inode->i_sb->s_dev;
+--
+1.7.7
+
+
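Editor's note: patches 3/9 and 6/9 both touch the same convention - XFS-internal helpers historically return positive errnos, while VFS entry points such as ->getattr and ->write_inode must return negative errnos, which is why the shutdown checks gain a leading minus. A minimal userspace sketch of the two conventions; fs_shut_down, xfs_internal_op and vfs_facing_getattr are made-up names, and the one-line XFS_ERROR() is only a simplified stand-in for the kernel macro.

#include <errno.h>
#include <stdio.h>

/*
 * Simplified stand-in: in non-debug XFS builds XFS_ERROR() roughly
 * reduces to the plain errno value; the real macro lives in the kernel.
 */
#define XFS_ERROR(e)    (e)

static int fs_shut_down = 1;

/* XFS-internal convention: positive errno on failure. */
static int xfs_internal_op(void)
{
        if (fs_shut_down)
                return XFS_ERROR(EIO);
        return 0;
}

/*
 * VFS-facing convention: 0 on success, negative errno on failure.
 * This is the sign fix made in xfs_vn_getattr() and xfs_fs_write_inode().
 */
static int vfs_facing_getattr(void)
{
        if (fs_shut_down)
                return -XFS_ERROR(EIO);
        return 0;
}

int main(void)
{
        printf("internal: %d, VFS-facing: %d\n",
               xfs_internal_op(), vfs_facing_getattr());
        return 0;
}
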
================================================================
---- CVS-web:
http://cvs.pld-linux.org/cgi-bin/cvsweb.cgi/packages/kernel/kernel-small_fixes.patch?r1=1.43.2.5&r2=1.43.2.6&f=u