[packages/kernel/LINUX_3_2] - raid6 corruption fix in case of 2 disks degradation

arekm arekm at pld-linux.org
Mon Aug 18 11:22:41 CEST 2014


commit df28e0131a913b1086c2eef4e7ac7d454652487d
Author: Arkadiusz Miśkiewicz <arekm at maven.pl>
Date:   Mon Aug 18 11:22:36 2014 +0200

    - raid6 corruption fix in case of 2 disks degradation

 kernel-small_fixes.patch | 70 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
---
diff --git a/kernel-small_fixes.patch b/kernel-small_fixes.patch
index 11080f3..10804ab 100644
--- a/kernel-small_fixes.patch
+++ b/kernel-small_fixes.patch
@@ -228,3 +228,73 @@ index 7a0c800..ec5ebbb 100644
 1.7.7.3
 
   
+
+Hi all,
+ There is a risk of data loss with md/raid6 arrays running on Linux since
+ 2.6.32.
+ If:
+   - the array is doubly degraded
+   - one or both failed devices are being recovered, and
+   - the array is written to
+
+ then it is possible for data on the array to be lost.  The patch below fixes
+ the problem.  If you apply the patch to an older kernel which has separate
+ handle_stripe5() and handle_stripe6() functions, be sure that the patch changes
+ handle_stripe6().
+
+ There is no risk to an optimal array or a singly-degraded array.  There is
+ also no risk on a doubly-degraded array which is not recovering a device or
+ is not receiving write requests.
+
+ If you have data on a RAID6 array, please consider how to avoid corruption,
+ possibly by applying the patch, possibly by removing any hot spares so
+ recovery does not automatically start.
+
+ This patch will be sent upstream shortly and will subsequently appear in
+ future "-stable" kernels.
+
+NeilBrown
+
+From f94e37dce722ec7b6666fd04be357f422daa02b5 Mon Sep 17 00:00:00 2001
+From: NeilBrown <neilb at suse.de>
+Date: Wed, 13 Aug 2014 09:57:07 +1000
+Subject: [PATCH] md/raid6: avoid data corruption during recovery of
+ double-degraded RAID6
+
+During recovery of a double-degraded RAID6 it is possible for
+some blocks not to be recovered properly, leading to corruption.
+
+If a write happens to one block in a stripe that would be written to a
+missing device, and at the same time that stripe is recovering data
+to the other missing device, then that recovered data may not be written.
+
+This patch skips, in the double-degraded case, an optimisation that is
+only safe for single-degraded arrays.
+
+Bug was introduced in 2.6.32 and fix is suitable for any kernel since
+then.  In an older kernel with separate handle_stripe5() and
+handle_stripe6() functions that patch must change handle_stripe6().
+
+Cc: stable at vger.kernel.org (2.6.32+)
+Fixes: 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8
+Cc: Yuri Tikhonov <yur at emcraft.com>
+Cc: Dan Williams <dan.j.williams at intel.com>
+Reported-by: "Manibalan P" <pmanibalan at amiindia.co.in>
+Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423
+Signed-off-by: NeilBrown <neilb at suse.de>
+
+diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
+index 6b2d615d1094..183588b11fc1 100644
+--- a/drivers/md/raid5.c
++++ b/drivers/md/raid5.c
+@@ -3817,6 +3817,8 @@ static void handle_stripe(struct stripe_head *sh)
+ 				set_bit(R5_Wantwrite, &dev->flags);
+ 				if (prexor)
+ 					continue;
++				if (s.failed > 1)
++					continue;
+ 				if (!test_bit(R5_Insync, &dev->flags) ||
+ 				    ((i == sh->pd_idx || i == sh->qd_idx)  &&
+ 				     s.failed == 0))
+
+
================================================================

---- gitweb:

http://git.pld-linux.org/gitweb.cgi/packages/kernel.git/commitdiff/df28e0131a913b1086c2eef4e7ac7d454652487d
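
The three risk conditions listed in the advisory above can be restated as a
small predicate. This is only an illustrative sketch, not kernel code: the
struct array_state type, its fields, and the raid6_at_risk() name are all
hypothetical and exist only to make the conditions explicit. The comparison
mirrors the "s.failed > 1" test that the patch adds.

```c
#include <stdbool.h>

/* Hypothetical summary of an md/raid6 array; not a kernel structure. */
struct array_state {
	int  failed_devices;   /* number of missing or failed member devices */
	bool recovering;       /* at least one failed device is being rebuilt */
	bool receiving_writes; /* the array is being written to */
};

/*
 * Returns true only when all three conditions from the advisory hold:
 * the array is doubly degraded, a recovery is in progress, and writes
 * are arriving. Optimal and singly-degraded arrays are never at risk.
 */
static bool raid6_at_risk(const struct array_state *a)
{
	return a->failed_devices > 1 &&
	       a->recovering &&
	       a->receiving_writes;
}
```

Under these assumptions, a doubly-degraded array that is either idle or not
recovering yields false, which matches the advisory's statement that such
arrays are not affected.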


