packages: kernel/kernel-small_fixes.patch - nasty race fix

arekm arekm at pld-linux.org
Wed Oct 19 22:20:48 CEST 2011


Author: arekm                        Date: Wed Oct 19 20:20:48 2011 GMT
Module: packages                      Tag: HEAD
---- Log message:
- nasty race fix

---- Files affected:
packages/kernel:
   kernel-small_fixes.patch (1.42 -> 1.43) 

---- Diffs:

================================================================
Index: packages/kernel/kernel-small_fixes.patch
diff -u packages/kernel/kernel-small_fixes.patch:1.42 packages/kernel/kernel-small_fixes.patch:1.43
--- packages/kernel/kernel-small_fixes.patch:1.42	Tue Oct 18 12:03:21 2011
+++ packages/kernel/kernel-small_fixes.patch	Wed Oct 19 22:20:43 2011
@@ -1275,3 +1275,62 @@
 xfs at oss.sgi.com
 http://oss.sgi.com/mailman/listinfo/xfs
 
+I don't usually pay much attention to the stale "? " addresses in
+stack backtraces, but this lucky report from Pawel Sikora hints that
+mremap's move_ptes() has inadequate locking against page migration.
+
+ 3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():
+ kernel BUG at include/linux/swapops.h:105!
+ RIP: 0010:[<ffffffff81127b76>]  [<ffffffff81127b76>]
+                      migration_entry_wait+0x156/0x160
+ [<ffffffff811016a1>] handle_pte_fault+0xae1/0xaf0
+ [<ffffffff810feee2>] ? __pte_alloc+0x42/0x120
+ [<ffffffff8112c26b>] ? do_huge_pmd_anonymous_page+0xab/0x310
+ [<ffffffff81102a31>] handle_mm_fault+0x181/0x310
+ [<ffffffff81106097>] ? vma_adjust+0x537/0x570
+ [<ffffffff81424bed>] do_page_fault+0x11d/0x4e0
+ [<ffffffff81109a05>] ? do_mremap+0x2d5/0x570
+ [<ffffffff81421d5f>] page_fault+0x1f/0x30
+
+mremap's down_write of mmap_sem, together with i_mmap_mutex or lock,
+and pagetable locks, were good enough before page migration (with its
+requirement that every migration entry be found) came in, and enough
+while migration always held mmap_sem; but not enough nowadays, when
+there's memory hotremove and compaction.
+
+The danger is that move_ptes() lets a migration entry dodge around
+behind remove_migration_pte()'s back, so it's in the old location when
+looking at the new, then in the new location when looking at the old.
+
+Either mremap's move_ptes() must additionally take anon_vma_lock(), or
+migration's remove_migration_pte() must stop peeking for is_swap_pte()
+before it takes pagetable lock.
+
+Consensus chooses the latter: we prefer to add overhead to migration
+than to mremapping, which gets used by JVMs and by exec stack setup.
+
+Reported-by: Pawel Sikora <pluto at agmk.net>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Acked-by: Andrea Arcangeli <aarcange at redhat.com>
+Acked-by: Mel Gorman <mgorman at suse.de>
+Cc: stable at vger.kernel.org
+
+--- 3.1-rc10/mm/migrate.c	2011-07-21 19:17:23.000000000 -0700
++++ linux/mm/migrate.c	2011-10-19 11:48:51.243961016 -0700
+@@ -120,10 +120,10 @@ static int remove_migration_pte(struct p
+ 
+ 		ptep = pte_offset_map(pmd, addr);
+ 
+-		if (!is_swap_pte(*ptep)) {
+-			pte_unmap(ptep);
+-			goto out;
+-		}
++		/*
++		 * Peek to check is_swap_pte() before taking ptlock?  No, we
++		 * can race mremap's move_ptes(), which skips anon_vma lock.
++		 */
+ 
+ 		ptl = pte_lockptr(mm, pmd);
+ 	}
+
+  
================================================================

---- CVS-web:
    http://cvs.pld-linux.org/cgi-bin/cvsweb.cgi/packages/kernel/kernel-small_fixes.patch?r1=1.42&r2=1.43&f=u


