[packages/kernel/LINUX_3_10] - added fix for Bad page map BUGs in Xen PVM
baggins
baggins at pld-linux.org
Wed May 7 11:51:02 CEST 2014
commit 89e907498be351208ebc9cb32065c7bd03b319c5
Author: Jan Rękorajski <baggins at pld-linux.org>
Date: Sun May 4 11:48:29 2014 +0200
- added fix for Bad page map BUGs in Xen PVM
kernel-small_fixes.patch | 107 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 107 insertions(+)
---
diff --git a/kernel-small_fixes.patch b/kernel-small_fixes.patch
index 2b0f82d..740b286 100644
--- a/kernel-small_fixes.patch
+++ b/kernel-small_fixes.patch
@@ -70,3 +70,110 @@ index 3b1ea34..eaa808e 100644
/* Ask for all the pages supported by this device */
result = scsi_vpd_inquiry(sdev, buf, 0, buf_len);
if (result)
+
+David Vrabel identified a regression when using automatic NUMA balancing
+under Xen whereby page table entries were getting corrupted due to the
+use of native PTE operations. Quoting him:
+
+ Xen PV guest page tables require that their entries use machine
+ addresses if the present bit (_PAGE_PRESENT) is set, and (for
+ successful migration) non-present PTEs must use pseudo-physical
+ addresses. This is because on migration MFNs in present PTEs are
+ translated to PFNs (canonicalised) so they may be translated back
+ to the new MFN in the destination domain (uncanonicalised).
+
+ pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
+ set and clear the _PAGE_PRESENT bit using pte_set_flags(),
+ pte_clear_flags(), etc.
+
+ In a Xen PV guest, these functions must translate MFNs to PFNs
+ when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
+ _PAGE_PRESENT.
+
+His suggested fix converted p[te|md]_[set|clear]_flags to using
+paravirt-friendly ops but this is overkill. He suggested an alternative of
+using p[te|md]_modify in the NUMA page table operations but this does
+more work than necessary and would require looking up a VMA for protections.
+
+This patch modifies the NUMA page table operations to use paravirt friendly
+operations to set/clear the flags of interest. Unfortunately this will take
+a performance hit when updating the PTEs on CONFIG_PARAVIRT but I do not
+see a way around it that does not break Xen.
+
+Cc: stable at vger.kernel.org
+Signed-off-by: Mel Gorman <mgorman at suse.de>
+Acked-by: David Vrabel <david.vrabel at citrix.com>
+Tested-by: David Vrabel <david.vrabel at citrix.com>
+---
+ include/asm-generic/pgtable.h | 31 +++++++++++++++++++++++--------
+ 1 file changed, 23 insertions(+), 8 deletions(-)
+
+diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
+index 34c7bdc..38a7437 100644
+--- a/include/asm-generic/pgtable.h
++++ b/include/asm-generic/pgtable.h
+@@ -680,24 +680,35 @@ static inline int pmd_numa(pmd_t pmd)
+ #ifndef pte_mknonnuma
+ static inline pte_t pte_mknonnuma(pte_t pte)
+ {
+- pte = pte_clear_flags(pte, _PAGE_NUMA);
+- return pte_set_flags(pte, _PAGE_PRESENT|_PAGE_ACCESSED);
++ pteval_t val = pte_val(pte);
++
++ val &= ~_PAGE_NUMA;
++ val |= (_PAGE_PRESENT|_PAGE_ACCESSED);
++ return __pte(val);
+ }
+ #endif
+
+ #ifndef pmd_mknonnuma
+ static inline pmd_t pmd_mknonnuma(pmd_t pmd)
+ {
+- pmd = pmd_clear_flags(pmd, _PAGE_NUMA);
+- return pmd_set_flags(pmd, _PAGE_PRESENT|_PAGE_ACCESSED);
++ pmdval_t val = pmd_val(pmd);
++
++ val &= ~_PAGE_NUMA;
++ val |= (_PAGE_PRESENT|_PAGE_ACCESSED);
++
++ return __pmd(val);
+ }
+ #endif
+
+ #ifndef pte_mknuma
+ static inline pte_t pte_mknuma(pte_t pte)
+ {
+- pte = pte_set_flags(pte, _PAGE_NUMA);
+- return pte_clear_flags(pte, _PAGE_PRESENT);
++ pteval_t val = pte_val(pte);
++
++ val &= ~_PAGE_PRESENT;
++ val |= _PAGE_NUMA;
++
++ return __pte(val);
+ }
+ #endif
+
+@@ -716,8 +727,12 @@ static inline void ptep_set_numa(struct mm_struct *mm, unsigned long addr,
+ #ifndef pmd_mknuma
+ static inline pmd_t pmd_mknuma(pmd_t pmd)
+ {
+- pmd = pmd_set_flags(pmd, _PAGE_NUMA);
+- return pmd_clear_flags(pmd, _PAGE_PRESENT);
++ pmdval_t val = pmd_val(pmd);
++
++ val &= ~_PAGE_PRESENT;
++ val |= _PAGE_NUMA;
++
++ return __pmd(val);
+ }
+ #endif
+
+--
+1.8.4.5
+
+--
+To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
+the body of a message to majordomo at vger.kernel.org
+More majordomo info at http://vger.kernel.org/majordomo-info.html
+Please read the FAQ at http://www.tux.org/lkml/
================================================================
---- gitweb:
http://git.pld-linux.org/gitweb.cgi/packages/kernel.git/commitdiff/6450047e1a021f3960e5206903c5001257a366fa
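
The following is a minimal user-space sketch of the problem the patch
description explains; it is not kernel code. It assumes a toy
pseudo-physical-to-machine mapping (a fixed offset, TOY_P2M_OFFSET) and
hypothetical helper names (toy_pte_val, toy_make_pte, native_mknuma,
pv_mknuma, pv_mknonnuma). The flag bit positions are illustrative only.
The one property modelled is that the accessor pair translates the frame
number only when _PAGE_PRESENT is set, as the quoted text describes for
Xen PV guests.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;      /* raw value as stored in the page table */

/* Illustrative flag positions; all sit below PAGE_SHIFT. */
#define _PAGE_PRESENT   ((pteval_t)1 << 0)
#define _PAGE_ACCESSED  ((pteval_t)1 << 5)
#define _PAGE_NUMA      ((pteval_t)1 << 8)
#define PAGE_SHIFT      12
#define FLAG_MASK       (((pteval_t)1 << PAGE_SHIFT) - 1)
#define TOY_P2M_OFFSET  ((pteval_t)0x1000)   /* hypothetical PFN -> MFN mapping */

static pteval_t toy_pfn_to_mfn(pteval_t pfn) { return pfn + TOY_P2M_OFFSET; }
static pteval_t toy_mfn_to_pfn(pteval_t mfn) { return mfn - TOY_P2M_OFFSET; }

/* Stand-in for a paravirt pte_val(): present entries hold an MFN and are
 * translated back to a PFN; non-present entries are returned untouched. */
static pteval_t toy_pte_val(pte_t pte)
{
    pteval_t raw = pte.pte;
    if (raw & _PAGE_PRESENT)
        raw = (toy_mfn_to_pfn(raw >> PAGE_SHIFT) << PAGE_SHIFT) | (raw & FLAG_MASK);
    return raw;
}

/* Stand-in for a paravirt __pte(): a PFN is translated to an MFN only if
 * the value being stored has _PAGE_PRESENT set. */
static pte_t toy_make_pte(pteval_t val)
{
    pte_t pte;
    if (val & _PAGE_PRESENT)
        val = (toy_pfn_to_mfn(val >> PAGE_SHIFT) << PAGE_SHIFT) | (val & FLAG_MASK);
    pte.pte = val;
    return pte;
}

static pteval_t toy_pte_pfn(pte_t pte) { return toy_pte_val(pte) >> PAGE_SHIFT; }

/* Old behaviour: flip flag bits directly on the raw stored value. */
static pte_t native_mknuma(pte_t pte)
{
    pte.pte |= _PAGE_NUMA;
    pte.pte &= ~_PAGE_PRESENT;               /* the MFN is left behind in a non-present entry */
    return pte;
}

/* Patched behaviour: read through pte_val(), edit flags, write through __pte(). */
static pte_t pv_mknuma(pte_t pte)
{
    pteval_t val = toy_pte_val(pte);
    val &= ~_PAGE_PRESENT;
    val |= _PAGE_NUMA;
    return toy_make_pte(val);
}

static pte_t pv_mknonnuma(pte_t pte)
{
    pteval_t val = toy_pte_val(pte);
    val &= ~_PAGE_NUMA;
    val |= (_PAGE_PRESENT | _PAGE_ACCESSED);
    return toy_make_pte(val);
}

/* Walks one entry through both variants; returns 0 when every check holds. */
static int toy_demo(void)
{
    pte_t p = toy_make_pte(((pteval_t)0x42 << PAGE_SHIFT) | _PAGE_PRESENT);

    assert(toy_pte_pfn(p) == 0x42);                            /* present: MFN maps back to PFN */
    assert(toy_pte_pfn(native_mknuma(p)) != 0x42);             /* native: stale MFN read as PFN */
    assert(toy_pte_pfn(pv_mknuma(p)) == 0x42);                 /* patched: PFN preserved */
    assert(toy_pte_pfn(pv_mknonnuma(pv_mknuma(p))) == 0x42);   /* round trip is consistent */
    return 0;
}
```

In this model the native variant leaves a machine frame number in a
non-present entry, so a later read yields the wrong page frame. That is
the kind of inconsistency the commit message attributes to the
"Bad page map" reports. The patched variant keeps the frame number
consistent in both directions.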